Tel Aviv University
            Raymond and Beverly Sackler Faculty of Exact Sciences



Classification of Traffic Sources on The Internet
               A study of the adaptability level of network agents



 Thesis submitted in partial fulfillment of the requirements for the M.Sc. degree


                            School of Computer Science
                               Tel Aviv University




                                        By

                                     Uri Gilad



   The research work in this thesis has been conducted under the supervision of
        Prof. Hezy Yeshurun, Tel Aviv University (Coordinating Instructor)
             Dr. Vern Paxson, International Computer Science Institute



                                    May 2005.
Abstract
        The subject of investigating the different types and purposes of network intruders
has received relatively little attention in the academic community, unlike the extensively
researched area of identifying intrusions. In this paper, we examine how adaptive
intruders are to changing environments. We investigate the different groups that form
when we rank intruders according to their adaptability. This research provides new
insight into the level of threat presented by highly adaptive intruders.

        This paper is organized as follows: first, we examine the current work in
intrusion detection and related subjects. We establish ground truth by performing an
experiment in a controlled environment. We present and justify our methods of data
collection and filtering, and move on to present results based on real data. Finally, we
present our conclusions and discuss directions for further research.

      The resulting algorithm is a useful tool for ranking the intruders attacking a
defended network according to their adaptability levels.




Contents
Classification of Traffic Sources on The Internet...........................................................1
Abstract..............................................................................................................................2
Contents..............................................................................................................................3
1. Introduction....................................................................................................................4
2. Algorithm........................................................................................................................8
3. Results...........................................................................................................................31
4. Discussion.....................................................................................................................46
5. Further work................................................................................................................49
References.........................................................................................................................51
Hebrew Abstract (תקציר)...................................................................................................57




1. Introduction
         Present-day computer security technologies are focused on detecting and
protecting against intruders. While there are many solutions for network defense, there
has been surprisingly little research on understanding the nature of attackers. Present-day
Internet traffic contains a variety of threats, all of which may direct traffic at any random
host connected to the network. This traffic originates from sources such as worms,
human attackers and automated scanning tools, alongside legitimate traffic.

        The rough classification above begs the question: can one automatically
categorize and determine the type of a traffic source? While it may seem secondary, the
nature of an attacker is paramount to understanding the seriousness of a threat. Attacks
that come from worms and scanning tools are “impersonal” and will not “insist” on
lingering at a specific site, while a human may return again and again. A human can
employ a wider array of techniques and keep hammering at a site until finally brute-
forcing a way inside. In addition, humans can be subject to retaliation by contacting the
responsible party, while the threat from autonomous agents has no clear identity behind
it to contact.


1.1 Traffic Sources on the Internet
       A random host connected to the Internet will be scanned and attacked within a
short time [21]. Most of the attacks come from automated agents, due to their
overwhelming numbers. Automated agents include worms: human-written programs
designed to propagate between computer systems by “infecting” a vulnerable machine
using a built-in exploit. After infection, the worm places a copy of itself on the targeted
system and moves on to infect the next victim. The copy thus spawned on the infected
system enters the same cycle. Research suggests that probes from worms and autorooters
heavily dominate the Internet [37].

        Another source of traffic on the Internet is automated scanning tools such as the
autorooters mentioned above. Autorooters are designed to quickly inspect as many hosts
as possible for information – without the propagation feature. There are benign scanning
tools such as web crawlers, designed to collect information for indexes such as Internet
search engines. Other scanning tools, however, are designed to seek vulnerabilities to
exploit. The people behind this last type of automated scanning tool use the data
gathered to return and break into a site found to be vulnerable. Autorooters, an evolution
of these automated vulnerability scanners, actually install a backdoor automatically on a
vulnerable machine and report the machine's IP address back to the human operating
them, who can now take control of the host.




        Rarely seen on the Internet are fully manual sources of hostile traffic: adaptive
sources that actually interact with a specific network site in order to break into it. There
is a distinction between these adaptive (possibly human) traffic sources and the two
rough classifications above. While humans may employ worms and automated scanning
tools, sometimes a human hacker will work against a site manually, with the intention of
achieving better results by employing a wider range of techniques.

        This paper presents a practical method to classify sources of attack on the
Internet. The classification is believed to be helpful in determining the nature of an
attacker – is it an automated agent or a manual one (a human)? The development and
experimentation are based on actual data from several distinct sites.


1.2 Related Work
        The subjects of examining the behavior of an attacker on the Internet,
understanding an attacker's nature, or classifying attackers by behavior have not received
much attention from the academic community. However, the related subjects of
differentiating between humans and computers and of detecting worm outbreaks (a
subset of the larger subject of intrusion detection) have been researched extensively.
Several papers touch upon this work's ideas.

         Telling computers and humans apart is a problem discussed by Von Ahn et al. [1].
The researchers describe a technique dubbed "CAPTCHA" (Completely Automated
Public Turing Test to Tell Computers and Humans Apart). As the acronym implies, the
work details an automatic algorithm to generate a test that human beings can pass, while
computer programs cannot. The test is based upon "hard to solve AI problems" that
human beings can solve easily, while computers cannot. The definition of "hard to
solve AI problems" relies on consensus among researchers, and therefore the problem set
may change in the future. The proof that human beings can solve these problems is
empirical. The work outlined in this paper and the CAPTCHA paper share an interest in
telling humans and software agents apart. While the CAPTCHA solution is taken from
the domain of AI and relies on voluntary test takers (if the test is not passed, resources
will be denied), this paper attempts to present an algorithm that operates under more
severe restrictions: the differentiation accomplished here is performed without the
knowledge or cooperation of the tested subject.

         In [7] Staniford et al. discuss theoretical advanced worm algorithms that use new
propagation mechanisms, random and non-random. Discussed are worms equipped with
a hit-list of targets to infect, worms sharing a common permutation to avoid repeating
infection attempts, and worms that study the application topology (for example, by
harvesting email addresses) to decide which computers to target for infection. Staniford
et al. envision a “cyber center for disease control” to identify and combat threats. In
this paper, we touch upon the task of identifying the threat and categorizing it as an
adaptive source attack or an automated agent that may be part of a worldwide infection.
The subject of “non-random” agents (such as worms equipped with a hit-list) is relevant
here, as identifying the origin of such an attack (a human who may change tactics, or a
worm) is important.

        Zou et al. [2] present a generic worm monitoring system. The system described
contains an ingress scan monitor, used for analyzing traffic to an unused IP address
space. The researchers observe that worms follow a simple attack behavior while a
hacker's attack is more complicated and cannot be simulated by known models. They
suggest using a recursive filtering algorithm to detect new traffic trends that may indicate
a case of worm infection. Zou's observation that hacker and worm behavior differ is not
expanded upon there; it is used implicitly here, as this paper describes a method for
classifying intruders based on their behavior. This work will present an algorithm that
uses the mentioned difference to group intruders with similar behavior.

        In [3] Wu et al. expand upon the subject of worm detection mentioned in [2].
Wu discusses several worm scanning techniques and re-introduces the subject of
monitoring unassigned IP address space. Wu et al. discuss several worm propagation
algorithms similar to those presented in [7]. The authors discuss worm detection and
propose the hypothesis that random-scanning worms scan the unassigned IP address
space and that this fact may present a way to detect them. The authors search for
common characteristics of worms (as opposed to other agents); one such common
characteristic is that a worm will scan a large number of unassigned IP addresses in a
short while. Wu et al. suggest an adaptive threshold as a way to detect worm outbreaks.
Examining the algorithm's results on traffic traces validates the work, and the conclusion
is that unknown worms can be detected after only 4% of the vulnerable machines in the
network are infected. Unused IP address monitoring is employed in this research as
well; this research draws upon the common characteristics of worms to detect
“automaton-like” behavior, as opposed to more random or adaptive behavior.
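
        To make the adaptive-threshold idea concrete, the following is a minimal sketch in
the spirit of [3], not a reconstruction of their detector: a source is flagged when the
number of unassigned (“dark”) addresses it touches in a time interval exceeds a threshold
that adapts to recent history. The smoothing factor, margin and floor are illustrative
assumptions.

    # Sketch of an adaptive-threshold detector for dark-address probing.
    def adaptive_threshold(counts, alpha=0.125, margin=3.0, floor=5.0):
        """counts: per-interval numbers of distinct unassigned addresses contacted
        by one source. Yields (interval, count, threshold, alarm)."""
        baseline = float(counts[0]) if counts else 0.0
        for i, c in enumerate(counts):
            threshold = max(floor, margin * baseline)
            yield i, c, threshold, c > threshold
            baseline = (1 - alpha) * baseline + alpha * c  # EWMA of recent activity

    # A sudden jump in probes to unassigned addresses raises an alarm.
    for row in adaptive_threshold([2, 3, 2, 4, 40, 60]):
        print(row)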

        In [17] Jung et al. present an algorithm to separate port scanning from benign
traffic by examining the number of connections made to active hosts vs. the number of
connections made to inactive hosts. Their observation is that there is a disparity in the
number of connections made to active hosts between benign and hostile sources. This
disparity in access attempts to active and inactive hosts is implemented in an algorithm
named TRW (Threshold Random Walk), which is used to successfully and efficiently
detect scanning hosts. The research by Jung et al. brings to light an important difference
between hostile scanners and benign users, which touches on this research. In this
research, an attempt is made to establish a range of behaviors, ranging from fully
automated agents (such as worms) to attackers who react to environmental changes and
modify their algorithm of attack accordingly – a behavior indicative of humans. The
work done in [17] assumes certain behavior on the part of users – they will access more
active sources than inactive sources. This research relies on the diversity of resources
accessed as another characteristic useful for classifying attackers. In [18] Weaver et al.
show that even a simplified version of TRW achieves quite good results, which
emphasizes the point that the difference in behavior between naïve and hostile sources is
of importance.
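
        The flavor of TRW can be illustrated with a minimal sketch of a sequential
hypothesis test over connection outcomes. The success probabilities and decision
thresholds below are illustrative assumptions, not the values derived in [17].

    import math

    THETA_BENIGN = 0.8    # assumed P(connection succeeds | benign source)
    THETA_SCANNER = 0.2   # assumed P(connection succeeds | scanning source)
    ETA_UPPER = math.log(99.0)      # cross upward: declare "scanner"
    ETA_LOWER = math.log(1.0 / 99)  # cross downward: declare "benign"

    def classify_source(connection_outcomes):
        """connection_outcomes: booleans, True if the contacted host/port was
        active (the connection succeeded), False otherwise."""
        llr = 0.0  # running log-likelihood ratio (scanner vs. benign)
        for success in connection_outcomes:
            if success:
                llr += math.log(THETA_SCANNER / THETA_BENIGN)
            else:
                llr += math.log((1 - THETA_SCANNER) / (1 - THETA_BENIGN))
            if llr >= ETA_UPPER:
                return "scanner"
            if llr <= ETA_LOWER:
                return "benign"
        return "undecided"

    # A source whose probes mostly hit inactive addresses is flagged quickly.
    print(classify_source([False, False, True, False, False, False]))  # "scanner"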




        Spitzner [4] describes the honeypot: a system that has no production value, so
that any access to it can be assumed to be illegitimate. In [5] Spitzner distinguishes
between different hackers based on their behavior – as evidenced by their actions when
attacking a network. This work is related to honeypot technology, as it relies on the
working assumption that most traffic directed at unused IP space is hostile. Further,
Spitzner notes the classification of Internet hostiles based on their activity, which is one
of the foundations for this research. Spitzner, however, does not expand on the idea of
classification beyond describing a specific group of Internet “operatives” dubbed “script
kiddies”. The intention of this research is to go beyond that and provide a method of
telling apart different hostile sources.

        Lee [6] presents a data mining approach to intrusion detection. The approach
presented by Lee is to apply current machine learning techniques to network sessions,
extracting features and analyzing them, and finally building a decision tree to separate
intrusion attempts from legitimate traffic. Building decision trees to aid in the
classification of traffic is a recognized method, but in this research we chose to rely on
expert knowledge, as this approach is assumed to produce better results. In Lee's
research, the resulting detection rate of below 70% is deemed unsatisfactory. The low
success rate in Lee's work encouraged us to try the approach presented here.


1.3 Rationale for this study
         In this paper, we present a practical algorithm to classify intruders according to
their activity as captured at the targeted network. The classification defines a range
between automated attacks and fully manual sources of attack such as human hackers.
This kind of classification is touched upon in some works, but is not explored to the
fullest extent.

        The algorithm is based on tested hypotheses about human and worm behavioral
differences. The method relies on the past behavior of simple worms in order to find the
common denominator of the behavior of automated attackers. After determining this
common denominator, attackers can be ranked and differentiated.




2. Algorithm
2.1 Establishing ground truth
        The research presented below is based on the assumption that there are core
differences between the traffic generated by a fully manual source of attack, such as a
human, and a fully automated source of attack, such as a worm. This assumption was
tested in an experiment before embarking on the development of a full-scale algorithm.

2.1.1 Experiment
        An initial algorithm was developed to test attacker behavior. The algorithm goes
over a pre-recorded network traffic dump. For each traffic source, the algorithm
calculates the series of ICMP echo requests and TCP/UDP ports the source accessed on
each destination address. For each new target the algorithm processes for this source, the
resulting access series is compared to those already recorded for the source on other
targets. If the access series is different, it is added to the list.

        It is proposed that a higher number of different access series for a specific source
is indicative of the ability to react to changes, and possibly of the guidance of a human.
The basis for this assumption is that we believe a human will react to different hosts in
different ways, producing different access series, while automated sources act according
to a simple deterministic algorithm programmed beforehand, independent of the
conditions in the network currently attacked.

        This algorithm is used to test the access series of humans and of simple
automatons/network worms. This is not the same as testing the behavior of an attacker to
the fullest extent, which would include examining the data passed in the various sessions
in the collected traffic. We believe that establishing that humans and worms differ in
their “access series” to targets is a good argument towards showing that there is a
difference in behavior between an automaton and a more manual source. This conclusion
is valid because a different access series is an example of different behavior.

        For the remainder of this paper, we term the number of different access series for
a single source the “adaptability score” of that source. The method of calculating an
access series for an intruder is given in section 2.3 – Algorithm details.
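
        As an illustration of the bookkeeping described above, the following is a minimal
sketch of the access-series calculation, assuming a simplified event format (source,
destination, service, in arrival order); the full procedure appears in section 2.3.
Collapsing immediate repeats of the same service is an assumption made for this sketch.

    from collections import defaultdict

    def adaptability_scores(events):
        """events: (source_ip, dest_ip, service) tuples in time order, where
        service is a string such as '80/TCP', '137/UDP' or 'ICMP'.
        Returns a mapping {source_ip: adaptability score}."""
        # Ordered series of services each source accessed on each destination.
        series = defaultdict(list)
        for src, dst, service in events:
            per_dest = series[(src, dst)]
            if not per_dest or per_dest[-1] != service:  # assumed: drop immediate repeats
                per_dest.append(service)

        # A source's adaptability score is the number of distinct access series
        # it produced across all of its targets.
        distinct = defaultdict(set)
        for (src, _dst), svc_list in series.items():
            distinct[src].add(tuple(svc_list))
        return {src: len(s) for src, s in distinct.items()}

Under this sketch, a worm that probes every victim over the same single port receives a
score of 1, while a source that varies its behavior from target to target accumulates a
higher score.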




2.1.2 Experiment layout
        The experiment was composed of two stages. In the first stage, the algorithm's
response to the activity of human attackers was tested, humans being an example of a
highly adaptive agent. In the second stage, the algorithm's response to network worms in
a lab environment was tested. Network worms, with their simple algorithms, are
examples of completely automatic agents.

2.1.2.1 Testing Human subjects
        A group of volunteers was assembled to test the assumptions about human
behavior. Besides being human, the test subjects had to be knowledgeable security
experts, with practical experience in penetration testing (i.e. breaking into company web
sites in order to test and improve their security). The human test subjects were given the
task of breaking into a prepared site.

       Only the IP address range was provided, without mention of the purpose of the
experiment or of the defense mechanisms employed at the site.

        The site is protected by an ActiveScout [16] machine, which creates virtual
resources for the humans to interact with. The test subjects are able to connect to
interactive and transaction-based services such as FTP, NetBIOS, HTTP and Telnet. The
virtual site presented appears to contain some “security holes” in the form of vulnerable
services and open ports. The vulnerable services show a welcome banner that clearly
proclaims an aged version of a common server application, one that is known to the
security community to be open to exploitation and can be abused to break into the
hosting computer. Most of the versions reported in the welcome banners have publicly
available exploits.

        The protection of the site is configured so that after a port scan, many more
“virtual” resources than real resources will answer, and so that accessing (as opposed to
scanning) these virtual resources will cause the attacker to be locked out of the site for a
short period of time (4 hours).

       As we are using a commercial product to implement the experiment, the lockout
period and conditions are configurable. Specifically, in this experiment lockout occurs
under the following conditions:
       1. If the intruder accesses a NetBIOS-based resource, lockout will occur after
           trying a user/password combination, but enumerating servers, users and other
           NetBIOS resources does not trigger lockout.
       2. If the intruder accesses an HTTP server, lockout will not be triggered.
       3. After the intruder completes a handshake with a TCP-based simulated
           resource, lockout will occur.
       These are the default settings of the product, which are claimed to provide
maximum protection while ensuring a low number of false positives.
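
       Read as a decision rule, these conditions can be modeled with the short sketch
below. The event fields and their values are assumptions made purely for illustration and
do not reflect the product's actual configuration interface.

    def lockout_triggered(event):
        """event: a dict with assumed keys 'service' ('netbios', 'http' or
        'tcp-simulated') and 'action' ('enumerate', 'login-attempt' or
        'handshake-completed')."""
        service, action = event["service"], event["action"]
        if service == "netbios":
            # Enumerating servers, users and shares is tolerated;
            # trying a user/password combination is not.
            return action == "login-attempt"
        if service == "http":
            return False  # access to a simulated HTTP server never triggers lockout
        if service == "tcp-simulated":
            return action == "handshake-completed"
        return False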




        A real web server was installed in the site to serve as the “trophy” to be found.
This web server – a Linux machine running an old (unpatched) version of Apache – was
vulnerable to several well-known weaknesses. A hacker finding (and breaking into) this
web server would find a message notifying him of his success.

        The structure of the experiment would appear to introduce only minor bias. The
experiment only approximates a “double blind” experiment, as the tester knows that the
attackers are human. The test subjects are not aware which algorithms are being used, or
what is being tested. The researcher is running an algorithm that is not aware of the type
of test subject. Some bias may have been introduced when designing the test
environment, however, as the site is obviously not a “real” one. There are no corporate
assets in the site, there is no real content beyond that which is provided by the
ActiveScout machine, and there are no legitimate uses for this site. These facts make the
site markedly different from a commercial presence on the web. However, the site can
pass for a bare-bones website, and the attackers employ the same techniques they
use when penetrating “real” web sites. Thus, the experiment will record human attacker
activity.

2.1.2.2 Testing automated agents - worms
        To test the assumptions about automated agent behavior, a representative group of
current and past network worms was analyzed. Worms are an example of an automated
agent, as a static pre-defined algorithm dictates a worm's behavior. The worm
experimentation is logistically simpler than the experiment with humans, since most
worms' algorithms are well known. Analysis of each worm's state machine produced
enough knowledge about its expected access series. The purpose of testing worms is to
understand how a very simple automaton behaves. Worms represent the simplest
automatons available.

2.1.3 Experiment results, human subjects
       We had each participant keep a journal while conducting the test, recording the
activities performed, the tools used and his conclusions.

       All human subjects achieved scores of 3 or higher on the adaptability scale,
meaning they had 3 or more different access series for the targets they accessed, whereas
worms ordinarily had an adaptability score of 1, as they perform according to a
deterministic state machine.

        Below is a summary of each human expert's results, some comments on each, and
the resulting adaptability calculation.




Test Subject #1
    1. Test subject #1 started the experiment by researching the Internet Whois database
       and DNS to find out information about the site.
    2. The test subject continued the experiment with port scans for various common
       ports, including FTP, SSH, HTTP, NetBIOS and some others – among them
       6000/TCP (X-Windows) and the range 1-1024/TCP (looking for other open
       services). These scans were conducted on the entire range supplied, in
       deterministic, sequential order.
    3. The test subject followed the above scans by manually connecting to SSH and
       HTTP open ports, and querying the services for versions and banner information.
    4. The test subject attempted to browse the HTTP sites found, but was unable to
       because the ActiveScout [16] had by this time determined him to be hostile, and
       locked him out of the site for the duration of 4 hours.
    5. The test subject did not realize at first that he was locked out – and continued
       various attempts, finally giving up and returning after 8 hours, only to accomplish
       basically the same results.
    6. The test concluded after the second scan.
    7. The test subject was able to determine which of the simulated computers was a
       “real” web server, but did not manage to break into it. The test subject arrived at
       this conclusion after noting the different content of the real web site and the
       simulated web sites. Following up on this hunch, the test subject mapped out the
       TCP/IP fingerprints of the simulated web site and the real web site, and found a
       bug in the implementation of the simulated web site: the simulated web site,
       while claiming to be the same OS as the real web site, had a different response to
       specific TCP packets.
Discussion
       From the log this test subject kept, we learn that he attempted to use outside
information (DNS, Whois) to learn about the site before accessing it, attempting to glean
information about available services and servers before doing the actual scan. This kind
of behavior is rarely seen in automated agents to date; although possible, most automated
agents do not seek outside information about the site being attacked. An exception is the
topologically aware worms discussed below.

        The classification thesis proved robust against this intruder, as the test subject was
attracted to open ports and attempted to penetrate them in various ways in order to break
into the site. The test subject did not follow a specific plan once the attack passed the
“reconnaissance” phase. This contributed to his adaptability and to a higher score from
the algorithm.

The access series for this test subject appear below in figure 1. These port groups
count towards an adaptability score of 5.




   1.   22/TCP,21/TCP,80/TCP,135/TCP,137/UDP,             22/TCP
   2.   22/TCP,21/TCP,80/TCP,135/TCP,137/UDP,             22/TCP,80/TCP
   3.   22/TCP,21/TCP,80/TCP,135/TCP,137/UDP,             80/TCP
   4.   22/TCP,21/TCP,80/TCP,135/TCP,137/UDP,             22/TCP,6000/TCP
   5.   1-1024/TCP,6000/TCP

   Score = 5

  Figure 1: Test subject #1 port groups.
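
        As a sanity check of the scoring rule, the five series of figure 1 can be fed directly
to the counting step sketched in section 2.1.1 (the transcription below is illustrative):

    # Test subject #1's recorded access series, transcribed from figure 1.
    subject_1_series = {
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP", "80/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "80/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP", "6000/TCP"),
        ("1-1024/TCP", "6000/TCP"),
    }
    print(len(subject_1_series))  # adaptability score = 5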

Test subject #2
   1. Test subject #2 started the experiment by making a select few connections to FTP,
       HTTP, NetBIOS and ports 467/TCP and 484/TCP. It is interesting to note that
       ports 467/TCP and 484/TCP are not used by any known protocol. When later
       queried about this, the test subject replied that the ports were selected without a
       practical reason, as an attempt to confuse an operator of the site, if there was one.
   2. These selective connections to specific ports suggest that the test subject made an
       earlier reconnaissance attempt from a different source address. The test subject
       confirmed this suspicion in a later interview.
   3. After a break of four days, the test subject returned from the same source address,
       and performed a “ping sweep” of the available address space.
   4. The test subject also performed scans to ports 98/TCP and 96/TCP.
   5. Finally, the test subject concluded with a horizontal NetBIOS scan.

Discussion
        In a later interview, the test subject confirmed that during the four days in which
he did not actively attack the site, he contemplated the problem and deduced that a
network device of some sort protects the site. The test subject did not divine the nature
of the “network device” – but his later behavior is explained by this deduction.
Apparently he realized that the device reacts to scans. He attempted to access unusual
ports in order to test the device's response. Failing to find any weakness, and receiving
what seemed like fake results from the NetBIOS scan, he gave up.

        The access series output by the algorithm is summarized in figure 2 below.

        1.   137/UDP 80/TCP ICMP
        2.   137/UDP 21/TCP ICMP
        3.   98/TCP 96/TCP
        4.   484/TCP 467/TCP

   Score = 4

  Figure 2: Test subject #2 port groups.




Test subject #3
   1. Test subject 3 started the experiment by automatically mapping out the entire
       network range.
    2. The test subject then followed up on several ports in random order, concentrating
        on 22/TCP (SSH), 80/TCP (HTTP) and 21/TCP (FTP).
   3. The scan included attempts to determine the operating system of the attacked
       virtual computers by testing TCP flags – using an automated tool called “nmap”
       [20].
   4. After every connection attempt to a responsive “virtual host” the test subject
       would be blocked out of the network for a period of time.
   5. This behavior of the network frustrated the test subject, and the experiment was
       concluded.

Discussion
        While the experiment with this test subject did not provide a large amount of
data to analyze, as the test subject gave up quickly on the test itself, it still benefited the
establishment of ground truth for the research. The test subject, although not very
thorough, did work in an adaptive way, returning to those ports that were responsive on
the scanned hosts. The test subject spent more time on responsive resources, and tried to
determine versions of applications and the operating system.

        The test subject's operating mode was completely adaptive; summarized below
are the different port series output by the algorithm for this test subject. The score for
this test subject is low, 3, and can be explained by his lack of interest and frustration.
Although low, this score is still higher than the score awarded to automated agents,
below.


       1. 22/TCP 21/TCP 22/TCP 80/TCP
       2. 21/TCP 22/TCP 21/TCP 80/TCP
       3. 21/TCP 22/TCP 80/TCP

   Score = 3


  Figure 3: Test subject #3 port groups.




2.1.4 Testing human subjects, conclusions
        Several conclusions arise from interviewing the test subjects and analyzing their
actions. It appears that all test subjects became bored with the work at one stage or
another. The fact that this was volunteer work, along with their growing suspicion that
“something fishy” was going on, contributed to their wrapping up the experiment more
quickly than they would have if the work had been paid or driven by personal interest –
two common motivations for human hackers. This could lead to another indicator of
human activity, but this research will not focus on it.

        All test subjects reported a suspicion that “something fishy” was going on. This
feeling was caused by the fact that the ActiveScout protection mechanism generated
resources in response to scans, and blocked users for a period of time after they had
proved to be hostile by accessing the generated resources. All test subjects eventually got
blocked. The blocking was released automatically after a period of time. This behavior
contributed to the frustration of the test subjects, as the site did not seem to provide a
consistent image of resources over time. Frustration is not a trait of automated agents.
While the test subjects were familiar with mechanisms that block offensive users, they
knew only signature-based mechanisms, not mechanisms that work by detecting access
to “virtual” resources.

        Another conclusion is that hackers may not immediately follow the
reconnaissance phase with an attack phase. Some hackers will take the results and
analyze them manually, and then employ exploits against each vulnerable spot at their
leisure. Additionally, these separate stages may come at the target from different source
IP addresses, due to the use of DHCP or even due to an active attempt to disguise the
origin of an attack. During the reconnaissance phase, when hackers employ automated
scanning tools, they are still considered to be automated agents. When hackers turn to
selectively attacking scanned resources, they will appear to be manual sources of attack.

         The algorithm proved robust even if hackers perform the reconnaissance phase
from a different source address than the one they attack from. The reconnaissance source
address will be detected (correctly) as an automated source, and the attack will be
detected as a more adaptive source. Running several automated tools from the same
source IP can produce enough adaptability to be counted as a manual source, which is as
expected – if the attacker employs a fully automated script, it will produce different
results than if the attacker runs several scanning tools in response to results from the
site.

        While human test subjects in this experiment employed automated tools for most
of the reconnaissance phase, they performed the attack phase manually. This result is
reinforced by the fact that the availability of scanning tools is much greater than the
availability of automated “autorooter tools” – tools that perform an attack automatically.
Autorooters perform the attack cycle from beginning to end, finally providing the user
with a list of compromised hosts. Most of the “attack tools” available are tools
which break into a single host, and the user is left to decide on his own what to do after
the break-in. A reasonable assumption would be that users who write even more
advanced attack tools show even greater adaptability and randomness of actions when
finally performing manual actions on their own.

2.1.5 Analysis of recent network worms as examples of automated
      agents
        To complete the process of establishing ground truth, we need to look at the other
side of the spectrum, at fully automated sources of attack. The most common automated
agent is the network worm. The following is a study of recent well-known worms
discovered on the Internet.

        We make a distinction between email worms and network worms. The
difference between these two kinds of malicious agents is that email-based worms usually
require some sort of human interaction – such as opening the mail message and executing
an attachment.

        The claim we need to establish is that network worms have a simple state
machine, resulting in a predictable order of actions. A simple, predictable behavior will
result in a low adaptability score. The worms studied are summarized below – for each
worm, the adaptability score is calculated according to the worm's algorithm. Included is
the list of ICMP echo requests and TCP/UDP ports the worm accesses; this list is the
basis for the adaptability score calculated.

        In some cases, a copy of network traffic from a captured host was not found, and
we had to rely on public knowledge bases such as SANS [38] and CERT [39]. This
knowledge was used to deduce the worm's algorithm, from which an adaptability score
was calculated.

2.1.5.1 Sadmind
First report by CERT: May 10, 2001 [22].
        The Sadmind worm (also named PoisonBox.worm in some resources) is a worm
that employs an exploit for a Solaris service (Sadmind) to spread. Once a machine is
infected, in addition to propagating, the machine will scan for and deface IIS web servers.
This is an example of a multi-mode worm, although the second mode is used for
defacement and not propagation.
        Since the IIS infection is performed in conjunction with the propagation process,
there are two possible port series for this worm. The worm will attempt the same series of
ports for all victims, be they exploitable Solaris workstations or defaceable IIS machines.
    The port series calculated for this worm are:
    1. 111/TCP, 600/TCP – this is the propagation phase: the first port is the portmapper
        service used for exploitation, and the exploit opens a backdoor on port 600/TCP.
    2. 80/TCP – the port used to launch an exploit against the IIS machines, which will
        then download the defaced web pages from the attacking machine.

The adaptability score for this worm would be 2, due to the additional feature of webpage
defacement – a new series of actions attempted against hosts independently of the first
series.


2.1.5.2 Code Red I & II
        The Code Red worm has several variants, but the security community makes a
distinction between two major variants, sharing the use of the same vulnerability:
The first Code Red variant was reported by CERT on Jul. 19, 2001 [23].
Code Red II was reported by CERT on Aug. 6, 2001.
        Code Red connects to the HTTP port of any victim and sends a specially crafted
request containing the worm itself; the worm code executes from the stack of the
exploited process. This worm's state machine is simple, and consists of two stages: the
generation of a random IP, and the probe/attack/spread phase, which is contained within
the malicious payload.

The list of ports contains only port 80/TCP – resulting in an adaptability score of 1.
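
        As an illustration of how little this state machine varies between victims, the
sketch below renders it as an inert simulation that only emits the per-victim access series
used for scoring; nothing is sent on the network, and the two-stage structure is taken from
the description above.

    import random

    def code_red_access_series(victims=5, seed=0):
        """Emit (target, access series) pairs for the assumed two-stage cycle:
        generate a random IP, then probe/attack/spread over 80/TCP."""
        rng = random.Random(seed)
        result = []
        for _ in range(victims):
            target = ".".join(str(rng.randrange(1, 255)) for _ in range(4))  # stage 1
            result.append((target, ("80/TCP",)))                             # stage 2
        return result

    # Every victim sees the identical series ('80/TCP',), so the number of
    # distinct access series, and hence the adaptability score, is 1.
    print(len({series for _target, series in code_red_access_series()}))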


2.1.5.3 Nimda
First report by CERT: Sep. 18, 2001. [26]
Nimda is a multi-mode worm, which spreads by any of the following means:
    1. Infecting files in open NetBIOS shares.
    2. Infecting web pages present on the attacked computer, thereby spreading to
        unsuspecting web clients.
    3. Replicating by sending itself as an executable attached to an email message from
        the infected computer.
    4. Attacking IIS machines, using a weakness of the HTTP server.
    5. Exploiting the backdoor left by Code Red II and Sadmind.
The Nimda worm gains a score of 2 with the algorithm, as the worm's algorithm
attacks the following port groups:
    1. 80/TCP [IIS & Code Red backdoor]
    2. 445/TCP, 139/TCP [open NetBIOS shares]

     The web page infection and the infecting email cannot be seen by our chosen IDS.
Although this worm has several modes of attack, its algorithm is based on a simple state
machine; even taking into consideration the modes not seen by our IDS, the worm will
still retain a low score of 3. This worm does, however, merit a discussion of multi-mode
worms – below.

2.1.5.4 Spida
First report by CERT: May 22, 2002. [24]
        The Spida worm connects to Microsoft SQL Server and exploits a default
“null” password for an administrative account in order to propagate and spread.

       The worm operates entirely over 1433/TCP, giving it an adaptability
score of 1.




2.1.5.5 Slapper
First report by CERT: Sep. 14, 2002. [25]
        The Slapper worm tests systems for mod_ssl (the vulnerable web server
module) by grabbing the web page banner from 80/TCP. If mod_ssl is present, the worm
connects to port 443/TCP and launches an exploit. Because this worm is
Linux-based, it downloads and recompiles its code on the infected system, as
opposed to other worms that use static binaries.

      Since the worm follows a pre-defined port order for all victims (80/TCP,
443/TCP), the worm has an adaptability score of 1.

2.1.5.6 Slammer
First report by CERT: Jan. 27, 2003. [27]
        The worm sends a specially crafted UDP packet to SQL services, which causes
the SQL server to start sending the same packet to random IP addresses in a never-
ending loop. The worm's state machine is extremely simple and is similar to the one
employed by Code Red II above – generate an IP address and probe/attack/spread.

      This malware has an adaptability score of 1; its port series consists simply of
1434/UDP.


2.1.5.8 Blaster
First report by CERT: Aug. 11, 2003 [28]
        Blaster attacks by exploiting a weakness in Microsoft DCOM RPC [41] services.
The worm connects to port 135/TCP, and the compromised host is instructed to “call
back” to the attacker and retrieve the worm code.

The worm operates on this single port only, gaining it a score of 1.

2.1.5.9 Welchia/Nachi
First report by CERT: August 18, 2003
        Shortly after Blaster was released, a worm dubbed “Welchia” [40] was unleashed.
This worm was especially interesting, as it seems it was written with the intent of being a
“benign worm”. When Welchia successfully infects a host, it will first seek and remove
any copies of the Blaster worm. Additionally, the worm will attempt to install the
relevant patch from Microsoft. Welchia exploits two vulnerabilities in Microsoft
systems: the RPC DCOM vulnerability used by Blaster, and a vulnerability in NTDLL
commonly known as the “WebDAV” vulnerability. The worm will seek one of 76 hard-
coded class B networks when attempting to infect via the WebDAV vulnerability.
Presumably, the worm's author scanned these networks beforehand, as the networks are
owned by Chinese organizations, and the WebDAV exploit bundled with Welchia only
works on some double-byte character platforms, Chinese being one of these vulnerable
platforms.



As Welchia targets different hosts with its two exploits, the worm will gain an
adaptability score of 2.

2.1.5.10 Sasser
First report by CERT: May 1, 2004
Reported by Symantec on April 30, 2004.
        The Sasser worm operates in a way not dissimilar to Blaster. The worm will
exploit an RPC service on the Microsoft Windows platform; the exploit causes a remote
command backdoor to be opened, through which the worm will instruct the victim to
download and execute the malicious payload.

       The port series for this worm is 445/TCP, with 9996/TCP for the remote backdoor
created. Some variants use ICMP echo requests to speed up the process, but all known
variants follow a single predetermined port series for all victims, gaining the worm a
score of 1.


2.1.5.11 Santy/PHP include worms.
First report by CERT: December 21, 2004
        The Santy worm is an example of a topological worm. This worm gathers a “hit
list” of targets to attack from a Google query (later variants use other search engines)
looking for a vulnerable version of a PHP (an HTML scripting language used to present
dynamic web pages) application, specifically phpBB – a bulletin board package.
        Since this worm does not perform any reconnaissance, it will bypass our chosen
IDS system: Active Response technology depends on intruders using baits distributed
during an early reconnaissance phase. However, going over captured traffic from this
worm shows that the worm performs a simple series of actions, all over the HTTP
port.
        The algorithm will award this worm an adaptability score of 1, as it does not
deviate from this order of actions, nor does it try any other services.



2.1.6 Multi mode worms, discussion
        As seen above, most worms have a very simple state machine, and they tend to
follow a very specific order of actions for each host attacked. However, there are worms
that will attempt several exploits against a single target; examples from the list above are
Nimda and Sadmind. Such worms will gain a higher adaptability score if they attempt a
different set of attacks for each host. Multi-mode worms will still linger only a very short
while with each victim, and they will still follow a pre-defined order of actions, although
their algorithm will be more complex. Multi-mode worms tend to have a lower number
of variants, and are generally rare (although this trend may change in the future).




2.1.7 Topologically aware worms, discussion
        Topological worms are worms that use additional sources of information when
deciding which victim to attack. An example is Santy (above). Such worms are not more
or less adaptive than other worms, as the process of choosing victims is independent of
the process of attacking each victim.

        These worms do present a problem for this research: our chosen IDS depends on
an intruder (be it an adaptive source such as a human or an automatic source) doing some
reconnaissance before attacking. If a worm is topologically aware, it may be able to
ignore the virtual resources the Active Response technology sets as baits by using
information gleaned elsewhere – such as a search engine.

        This problem is mitigated by the fact that, by looking at a traffic dump of a source
infected by a topologically aware worm, an adaptability score can be determined
manually.




A proposed architecture for networkA proposed architecture for network
A proposed architecture for network
 
Internet Worm Classification and Detection using Data Mining Techniques
Internet Worm Classification and Detection using Data Mining TechniquesInternet Worm Classification and Detection using Data Mining Techniques
Internet Worm Classification and Detection using Data Mining Techniques
 
L017317681
L017317681L017317681
L017317681
 
2011-A_Novel_Approach_to_Troubleshoot_Security_Attacks_in_Local_Area_Networks...
2011-A_Novel_Approach_to_Troubleshoot_Security_Attacks_in_Local_Area_Networks...2011-A_Novel_Approach_to_Troubleshoot_Security_Attacks_in_Local_Area_Networks...
2011-A_Novel_Approach_to_Troubleshoot_Security_Attacks_in_Local_Area_Networks...
 
Modeling & automated containment of worms(synopsis)
Modeling & automated containment of worms(synopsis)Modeling & automated containment of worms(synopsis)
Modeling & automated containment of worms(synopsis)
 
G0262042047
G0262042047G0262042047
G0262042047
 

Intruder adaptability

attackers. Present day Internet traffic contains a variety of threats, all of which may direct traffic at any random host connected to the network. This traffic originates from sources such as worms, attacking hackers, automated scanning tools and legitimate users. This rough classification begs the question: can one automatically categorize and determine the type of a traffic source?

While it may seem secondary, the nature of an attacker is paramount to understanding the seriousness of a threat. Attacks that come from worms and scanning tools are "impersonal" and will not "insist" on lingering at a specific site, while a human may return again and again. A human can employ a wider array of techniques and keep hammering at a site until finally brute-forcing a way inside. In addition, humans can be subject to retaliation by contacting the responsible party, while the threat posed by an autonomous agent has no clear identity behind it to contact.

1.1 Traffic Sources on the Internet

A random host connected to the Internet will be scanned and attacked within a short while [21]. Most of the attacks come from automated agents, due to their overwhelming numbers. Automated agents include worms: human-written pieces of software designed to propagate between computer systems by "infecting" a vulnerable machine using a built-in exploit. After infection, the worm places a copy of itself on the targeted system and moves on to infect the next victim; the copy spawned on the infected system enters the same cycle. Research suggests that probes from worms and autorooters heavily dominate the Internet [37].

As mentioned above, another source of traffic on the Internet is automated scanning tools such as the abovementioned autorooters. Autorooters are designed to quickly inspect as many hosts as possible for information, without the propagation feature. There are benign scanning tools such as web crawlers, designed to collect information for indexes such as Internet search engines. Other scanning tools, however, are designed to seek vulnerabilities to exploit. The people behind this last type of automated scanning tool use the data gathered to return and break into a site found to be vulnerable. Autorooters, an evolution of these automated vulnerability scanners, automatically install a backdoor on a vulnerable machine and report the machine's IP address back to the human operating them, who can then take control of the host.
Rarely seen on the Internet are fully manual sources of hostile traffic: adaptive sources that actually interact with a specific network site in order to break into it. There is a distinction between these adaptive (possibly human) traffic sources and the two rough classes above. While humans may employ worms and automated scanning tools, sometimes a human hacker will work against a site manually, with the intention of achieving better results by employing a wider range of techniques.

This paper presents a practical method to classify sources of attack on the Internet. The classification is believed to be helpful when determining the nature of an attacker: is it an automated agent or a manual one (a human)? The development and experimentation are based on actual data from several distinct sites.

1.2 Related Work

The subjects of examining the behavior of an attacker on the Internet, understanding an attacker's nature, or classifying attackers by behavior have not received much attention from the academic community. However, the related subjects of differentiating between humans and computers and of detecting worm outbreaks (a subset of the larger subject of intrusion detection) have been researched extensively. Several papers touch upon this work's ideas.

Telling computers and humans apart is a problem discussed by Von Ahn et al. [1]. The researchers describe a technique dubbed "CAPTCHA" (Completely Automated Public Turing Test to Tell Computers and Humans Apart). As the acronym implies, the work details an automatic algorithm to generate a test that human beings can pass while computer programs cannot. The test is based upon "hard to solve AI problems" that human beings solve easily while computers cannot. The definition of "hard to solve AI problems" relies on consensus among researchers, and therefore the problem set may change in the future; the proof that human beings can solve these problems is empirical. The work outlined in this paper and the CAPTCHA paper share an interest in telling humans and software agents apart. However, while the CAPTCHA solution is taken from the domain of AI and relies on voluntary test takers (if the test is not passed, resources are denied), this paper attempts to present an algorithm that deals with more severe restrictions: the differentiation accomplished here is performed without the knowledge or cooperation of the tested subject.

In [7] Staniford et al. discuss theoretical advanced worm algorithms that use new propagation mechanisms, random and non-random. Discussed are worms equipped with a hit-list of targets to infect, worms sharing a common permutation to avoid repeating infection attempts, and worms that study the application topology (for example, by harvesting email addresses) to decide which computers to target for infection. Staniford et al. envision a "cyber center for disease control" to identify and combat threats. In this paper, we touch upon the task of identifying the threat and categorizing it as an adaptive source attack or as an automated agent that may be part of a worldwide infection. The subject of "non-random" agents (such as worms equipped with a hit-list) is important, as the issue of identifying the origin of such an attack (a human who may change tactics, or a worm) is itself of importance.
Zou et al. [2] present a generic worm monitoring system. The system described contains an ingress scan monitor used for analyzing traffic to unused IP address space. The researchers observe that worms follow a simple attack behavior, while a hacker's attack is more complicated and cannot be simulated by known models. They suggest using a recursive filtering algorithm to detect new traffic trends that may indicate worm infection. Zou's observation that hacker and worm behavior differ is not expanded upon there; in this paper the observation is used implicitly, as we describe a method for classifying intruders based on their behavior and present an algorithm that uses this difference to group intruders with similar behavior.

In [3] Wu et al. expand upon the subject of worm detection mentioned in [2]. Wu discusses several worm scanning techniques, re-introduces the subject of monitoring unassigned IP address space, and covers several worm propagation algorithms similar to those presented in [7]. The authors propose the hypothesis that random-scanning worms will probe the unassigned IP address space and that this fact may present a way to detect them. They search for characteristics common to worms (as opposed to other agents); one such characteristic is that a worm will scan a large number of unassigned IP addresses within a short while. Wu et al. suggest an adaptive threshold as a way to detect worm outbreaks. The work is validated by examining the algorithm's results on traffic traces, with the conclusion that unknown worms can be detected after only 4% of the vulnerable machines in the network are infected. Unused IP address monitoring is employed in this research as well, and this research likewise draws upon characteristics common to worms in order to detect "automaton-like" behavior, as opposed to more random or adaptive behavior.

In [17] Jung et al. present an algorithm to separate port scanning from benign traffic by examining the number of connections made to active hosts versus the number of connections made to inactive hosts. Their observation is that there is a disparity in the number of connections made to active hosts between benign and hostile sources. This disparity is exploited in an algorithm named TRW (Threshold Random Walk), which detects scanning hosts successfully and efficiently. The research by Jung et al. brings to light an important difference between hostile scanners and benign users, which touches on this research: here, an attempt is made to establish a range of behaviors, from fully automated agents (such as worms) to attackers who react to environmental changes and modify their attack algorithm accordingly, a behavior indicative of humans. The work in [17] assumes certain behavior on the part of users (they will access more active resources than inactive ones); this research relies on the diversity of resources accessed as another characteristic useful for classifying attackers. In [18] Weaver et al. show that even a simplified version of TRW achieves quite good results, which emphasizes that the difference in behavior between naïve and hostile sources is of importance.
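Since TRW serves as a point of comparison for the approach developed here, a brief illustration may be useful. The following Python fragment is a minimal sketch of the sequential hypothesis test described by Jung et al. [17], written for this discussion rather than taken from that work; the success probabilities and the detection/false-positive targets are illustrative defaults, not necessarily the exact figures of the original paper.

def trw_verdict(outcomes, theta0=0.8, theta1=0.2, p_detect=0.99, p_false=0.01):
    """Threshold Random Walk over one remote source's connection outcomes.
    outcomes: iterable of booleans, True if the probed local host/port
    answered, False if it did not.
    H0: the source is benign (connections succeed with probability theta0).
    H1: the source is a scanner (connections succeed with probability theta1).
    Returns "scanner", "benign" or "undecided"."""
    eta1 = p_detect / p_false                # upper threshold: declare scanner
    eta0 = (1 - p_detect) / (1 - p_false)    # lower threshold: declare benign
    likelihood_ratio = 1.0
    for success in outcomes:
        if success:
            likelihood_ratio *= theta1 / theta0
        else:
            likelihood_ratio *= (1 - theta1) / (1 - theta0)
        if likelihood_ratio >= eta1:
            return "scanner"
        if likelihood_ratio <= eta0:
            return "benign"
    return "undecided"

# A source whose probes mostly fail (it keeps hitting inactive hosts) is
# flagged after only a handful of attempts:
print(trw_verdict([False, False, False, False]))   # scanner
print(trw_verdict([True, True, True, True]))       # benign

The contrast with this work is that TRW keys on whether a source's probes succeed or fail, whereas the adaptability score defined in chapter 2 keys on how much the sequence of probed services varies between targets.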
Spitzner [4] describes the honeypot: a system that has no production value, so that any access to it can be assumed to be illegitimate. In [5] Spitzner distinguishes between different kinds of hackers, basing the distinction on their behavior as evident in their actions when attacking a network. This work is related to honeypot technology, as it uses the working assumption that most traffic directed at unused IP space is hostile. Further, Spitzner notes the classification of Internet hostiles based on their activity, which is one of the foundations of this research. Spitzner, however, does not expand on the idea of classification beyond describing a specific group of Internet "operatives" dubbed "script kiddies". The intention of this research is to go beyond that and provide a method for telling different hostile sources apart.

Lee [6] presents a data mining approach to intrusion detection. The approach is to apply current machine learning techniques to network sessions, extracting and analyzing features, and finally building a decision tree to separate intrusion attempts from legitimate traffic. Building decision trees to aid in the classification of traffic is a recognized method, but in this research we chose to rely on expert knowledge, as this approach is assumed to produce better results. In Lee's research the resulting detection rate, below 70%, is deemed unsatisfactory; this low success rate encouraged us to try the approach presented here.

1.3 Rationale for this study

In this paper, we present a practical algorithm to classify intruders according to their activity as captured at the targeted network. The classification defines a range between automated attacks and fully manual sources of attack such as human hackers. This kind of classification is touched upon in some works, but has not been explored to its fullest extent. The algorithm is based on tested hypotheses about human and worm behavioral differences. The method relies on the past behavior of simple worms in order to find the common denominator of automated attacker behavior. After determining this common denominator, attackers can be ranked and differentiated.
2. Algorithm

2.1 Establishing ground truth

The research presented below is based on the assumption that there are core differences between the traffic generated by a fully manual source of attack, such as a human, and a fully automated source of attack, such as a worm. This assumption was tested in an experiment before embarking on the development of a full-scale algorithm.

2.1.1 Experiment

An initial algorithm was developed to test attacker behavior. The algorithm goes over a pre-recorded network traffic dump. For each traffic source, it calculates the series of ICMP echo requests and TCP/UDP ports the source accessed on each destination address. For each new target the algorithm processes for a source, the resulting access series is compared to those the source already performed on other targets; if the access series is different, it is added to the list. We propose that a higher number of different access series for a specific source is indicative of the ability to react to changes, and possibly of the guidance of a human. The basis for this assumption is the belief that a human will react to different hosts in different ways, producing different access series, while an automated source acts upon a simple deterministic algorithm programmed beforehand, independent of the conditions in the network currently attacked.

This algorithm is used to test the access series of humans and of simple automatons/network worms. This is not the same as testing the behavior of an attacker to the fullest extent, which would include examining the data passed in the various sessions in the collected traffic. We believe that establishing that humans and worms differ in their "access series" to targets is a good argument toward proving that there is a difference in behavior between an automaton and a more manual source; this conclusion is valid because a different access series is an example of different behavior. For the remainder of this paper, we term the number of different access series for a single source its "adaptability score". The method of calculating an access series for an intruder is given in section 2.3, Algorithm details.
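To make the bookkeeping concrete, the following Python sketch counts distinct access series per source. It is an illustration of the idea only, not the implementation used in this research (the exact rules for building and comparing access series are given in section 2.3), and the event format is an assumption made for the example.

from collections import defaultdict

def adaptability_scores(events):
    """events: iterable of (source, destination, service) tuples in observed
    order, where service is a string such as "80/TCP", "137/UDP" or "ICMP".
    Returns {source: adaptability score}, i.e. the number of distinct access
    series the source produced across its destinations."""
    # 1. Rebuild, per (source, destination), the ordered series of services
    #    the source accessed on that destination.
    series = defaultdict(list)
    for src, dst, service in events:
        series[(src, dst)].append(service)

    # 2. For each source, count how many different series appear among its
    #    destinations.  A purely deterministic automaton yields one series.
    distinct = defaultdict(set)
    for (src, dst), accesses in series.items():
        distinct[src].add(tuple(accesses))
    return {src: len(unique) for src, unique in distinct.items()}

# Toy example: a worm-like source probes every target identically, while an
# adaptive source varies its behavior per target.
events = [
    ("worm", "10.0.0.1", "80/TCP"), ("worm", "10.0.0.2", "80/TCP"),
    ("adaptive", "10.0.0.1", "21/TCP"), ("adaptive", "10.0.0.1", "22/TCP"),
    ("adaptive", "10.0.0.2", "80/TCP"),
]
print(adaptability_scores(events))   # {'worm': 1, 'adaptive': 2}

Under this counting, the simple worms analyzed in section 2.1.5 score 1 or 2, while the human test subjects in section 2.1.3 scored between 3 and 5.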
2.1.2 Experiment layout

The experiment was composed of two stages. In the first stage, the algorithm's response to the activity of human attackers was tested, humans being an example of a highly adaptive agent. In the second stage, the algorithm's response to network worms was tested in a lab environment; network worms, with their simple algorithms, are examples of completely automatic agents.

2.1.2.1 Testing human subjects

A group of volunteers was assembled to test the assumptions about human behavior. Besides being human, the test subjects had to be knowledgeable security experts with practical experience in penetration testing (i.e., breaking into company web sites in order to test and improve their security). The test subjects were given the task of breaking into a prepared site. Only the IP address range was provided, with no mention of the purpose of the experiment or of the defense mechanisms employed at the site.

The site is protected by an ActiveScout [16] machine, which creates virtual resources for the humans to interact with. The test subjects are able to connect to interactive and transaction-based services such as FTP, NetBIOS, HTTP and Telnet. The virtual site appears to contain some "security holes" in the form of vulnerable services and open ports. The vulnerable services show a welcome banner that clearly proclaims an aged version of a common server application, one known to the security community to be open to exploitation and abusable to break into the hosting computer. Most of the versions reported in the welcome banners have publicly available exploits.

The protection of the site is configured so that after a port scan, many more "virtual" resources than real resources will answer, and so that accessing (as opposed to scanning) these virtual resources will cause the attacker to be locked out of the site for a short duration (4 hours). As a commercial product is used to implement the experiment, the lockout period and conditions are configurable. Specifically, in this experiment lockout occurs under the following conditions:

1. If the intruder accesses a NetBIOS-based resource, lockout occurs after a user/password combination is tried; enumerating servers, users and other NetBIOS resources does not trigger lockout.
2. If the intruder accesses an HTTP server, lockout is not triggered.
3. After the intruder completes a handshake with a TCP-based simulated resource, lockout occurs.

These are the default settings of the product, which are claimed to provide maximum protection while ensuring a low number of false positives.
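For clarity, the lockout conditions above can be phrased as a small decision function. This is only a paraphrase of the configured policy as described, not ActiveScout code; the event fields, and the assumption that the NetBIOS and HTTP rules take precedence over the general TCP-handshake rule, are introduced here for illustration.

def should_lock_out(event):
    """event: dict describing an intruder's interaction with a simulated
    resource, e.g. {"service": "netbios", "action": "login_attempt"} or
    {"service": "tcp", "action": "handshake_completed", "port": 25}.
    Returns True if the source should be locked out (for 4 hours in the
    experiment's configuration)."""
    service = event["service"]
    action = event["action"]

    if service == "netbios":
        # Trying a user/password combination triggers lockout; merely
        # enumerating servers, users and shares does not.
        return action == "login_attempt"
    if service == "http":
        # Access to the simulated HTTP server does not trigger lockout.
        return False
    # Any other TCP-based simulated resource: lockout once the intruder
    # completes a handshake, i.e. goes beyond scanning.
    return action == "handshake_completed"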
A real web server was installed in the site to serve as the "trophy" to be found. This web server, a Linux machine running an old (unpatched) version of Apache, was vulnerable to several well-known weaknesses. A hacker finding (and breaking into) this web server would find a message notifying him of his success.

The structure of the experiment would appear to introduce only minor bias. The experiment only approximates a "double blind" experiment, as the tester knows that the attackers are human; however, the test subjects are not aware of which algorithms are being used or what is being tested, and the researcher runs an algorithm that is not aware of the type of test subject. Some bias may have been introduced when designing the test environment, as the site is obviously not a "real" one: there are no corporate assets in the site, there is no real content beyond that which is provided by the ActiveScout machine, and there are no legitimate uses for the site. These facts make the site markedly different from a commercial presence on the web. However, the site can pass for a bare-bones website, and the attackers employ the same techniques they use when penetrating "real" web sites. Thus, the experiment records genuine human attacker activity.

2.1.2.2 Testing automated agents - worms

To test the assumptions about automated agent behavior, a representative group of current and past network worms was analyzed. Worms are an example of an automated agent, as a static pre-defined algorithm dictates a worm's behavior. The worm experimentation is logistically simpler than running the experiment with humans, since most worms' algorithms are well known: analysis of each worm's state machine produced enough knowledge about its expected access series. The purpose of testing worms is to understand how a very simple automaton behaves; worms represent the simplest automatons available.

2.1.3 Experiment results, human subjects

We had each participant keep a journal while conducting the test, recording the activities performed, the tools used and his conclusions. All human subjects achieved scores of 3 or higher on the adaptability scale, meaning they produced 3 or more different access series across the targets they accessed, while worms ordinarily have an adaptability score of 1, as they act according to a deterministic state machine. Below is a summary of each human expert's results, some comments on each, and the resulting adaptability calculation.
Test Subject #1

1. Test subject #1 started the experiment by researching the Internet Whois database and DNS to find information about the site.
2. The test subject continued with port scans for various common ports, including FTP, SSH, HTTP, NetBIOS and some others, among them 6000/TCP (X-Windows) and the range 1-1024/TCP (looking for other open services). These scans were conducted on the entire range supplied, in deterministic, sequential order.
3. The test subject followed the above scans by manually connecting to the open SSH and HTTP ports and querying the services for version and banner information.
4. The test subject attempted to browse the HTTP sites found, but was unable to, because the ActiveScout [16] had by this time determined him to be hostile and locked him out of the site for a duration of 4 hours.
5. The test subject did not realize at first that he was locked out, and continued various attempts, finally giving up and returning after 8 hours to accomplish essentially the same results.
6. The test concluded after the second scan.
7. The test subject was able to determine which of the simulated computers was a "real" web server, but did not manage to break into it. He arrived at this conclusion after noting the different content of the real web site and the simulated web sites. Following up on this hunch, he mapped out the TCP/IP fingerprints of the simulated web site and the real web site and found a bug in the implementation of the simulated web site: while claiming to be the same OS model as the real web site, it had a different response to specific TCP packets.

Discussion

From the log this test subject kept, we learn that he attempted to use outside information (DNS, Whois) to learn about the site before accessing it, attempting to glean information about available services and servers before doing the actual scan. This kind of behavior is rarely seen in automated agents to date; although possible, most automated agents do not seek outside information about the site being attacked, an exception being the topologically aware worms discussed below. The classification thesis proved robust against this intruder, as the test subject was attracted to open ports and attempted to penetrate them in various ways in order to break into the site. The test subject did not follow a specific plan once the attack passed the "reconnaissance" phase; this contributed to his adaptability and to a greater score from the algorithm. The access series for this test subject appear below in figure 1. These port groups are counted towards an adaptability score of 5.
1. 22/TCP, 21/TCP, 80/TCP, 135/TCP, 137/UDP, 22/TCP
2. 22/TCP, 21/TCP, 80/TCP, 135/TCP, 137/UDP, 22/TCP, 80/TCP
3. 22/TCP, 21/TCP, 80/TCP, 135/TCP, 137/UDP, 80/TCP
4. 22/TCP, 21/TCP, 80/TCP, 135/TCP, 137/UDP, 22/TCP, 6000/TCP
5. 1-1024/TCP, 6000/TCP

Score = 5

Figure 1: Test subject #1 port groups.

Test subject #2

1. The test subject started the experiment by making a select few connections to FTP, HTTP, NetBIOS and ports 467/TCP and 484/TCP. It is interesting to note that ports 467/TCP and 484/TCP are not used by any known protocol. When later queried about this, the test subject replied that the ports were selected without a practical reason, but as an attempt to confuse the operator of the site, if there was one.
2. These selective connections to specific ports suggest that the test subject made an earlier reconnaissance attempt from a different source address. The test subject confirmed this suspicion in a later interview.
3. After a break of four days, the test subject returned from the same source address and performed a "ping sweep" of the available address space.
4. The test subject also performed scans of ports 98/TCP and 96/TCP.
5. Finally, the test subject concluded with a horizontal NetBIOS scan.

Discussion

In a later interview, the test subject confirmed that in the four days he did not actively attack the site, he contemplated the problem and deduced that a network device of some sort protects the site. The test subject did not divine the nature of the "network device", but his later behavior is explained by this deduction. Apparently he realized that the device reacts to scans, and he attempted to access unusual ports in order to test the device's response. Failing to find any weakness, and receiving what seemed like fake results from the NetBIOS scan, he gave up. The access series output by the algorithm are summarized in figure 2 below.

1. 137/UDP, 80/TCP, ICMP
2. 137/UDP, 21/TCP, ICMP
3. 98/TCP, 96/TCP
4. 484/TCP, 467/TCP

Score = 4

Figure 2: Test subject #2 port groups.
Test subject #3

1. Test subject #3 started the experiment by automatically mapping out the entire network range.
2. The test subject then followed up on several ports in random order, concentrating on 22/TCP (SSH), 80/TCP (HTTP) and 21/TCP (FTP).
3. The scan included attempts to determine the operating system of the attacked virtual computers by testing TCP flags, using an automated tool called "nmap" [20].
4. After every connection attempt to a responsive "virtual host", the test subject would be blocked out of the network for a period of time.
5. This behavior of the network frustrated the test subject, and the experiment was concluded.

Discussion

While the experiment with this test subject did not provide a large amount of data to analyze, as he gave up on the test quickly, it still contributed to establishing ground truth for the research. The test subject, although not very thorough, did work in an adaptive way, returning to those ports that were responsive on the scanned hosts. He spent more time on responsive resources and tried to determine application versions and the operating system. The test subject's operating mode was completely adaptive; the different port series output by the algorithm for him are summarized below. The score for this test subject is low, 3, and can be explained by his lack of interest and frustration. Although low, this score is still higher than the scores awarded to the automated agents below.

1. 22/TCP, 21/TCP, 22/TCP, 80/TCP
2. 21/TCP, 22/TCP, 21/TCP, 80/TCP
3. 21/TCP, 22/TCP, 80/TCP

Score = 3

Figure 3: Test subject #3 port groups.
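As a small consistency check on the bookkeeping sketched in section 2.1.1, the scores in figures 1 through 3 are simply the number of distinct access series per subject. The fragment below transcribes the figures and counts them; it is only a restatement of the figures, not new measurement data.

# Access series transcribed from figures 1-3, one tuple per series.
subject_series = {
    "Test subject #1": [
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP", "80/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "80/TCP"),
        ("22/TCP", "21/TCP", "80/TCP", "135/TCP", "137/UDP", "22/TCP", "6000/TCP"),
        ("1-1024/TCP", "6000/TCP"),
    ],
    "Test subject #2": [
        ("137/UDP", "80/TCP", "ICMP"),
        ("137/UDP", "21/TCP", "ICMP"),
        ("98/TCP", "96/TCP"),
        ("484/TCP", "467/TCP"),
    ],
    "Test subject #3": [
        ("22/TCP", "21/TCP", "22/TCP", "80/TCP"),
        ("21/TCP", "22/TCP", "21/TCP", "80/TCP"),
        ("21/TCP", "22/TCP", "80/TCP"),
    ],
}

for subject, series in subject_series.items():
    print(subject, "adaptability score =", len(set(series)))
# Test subject #1 adaptability score = 5
# Test subject #2 adaptability score = 4
# Test subject #3 adaptability score = 3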
2.1.4 Testing human subjects, conclusions

Several conclusions arise from interviewing the test subjects and analyzing their actions.

It appears that all test subjects became bored at one stage or another with the work. The fact that this was volunteer work, along with their growing suspicion that "something fishy" was going on, contributed to their wrapping up the experiment more quickly than they would have if the work had been paid or driven by personal interest, two common motivations for human hackers. This could lead to another indicator of human activity, but this research does not focus on it.

All test subjects reported a suspicion that "something fishy" was going on. This feeling was caused by the fact that the ActiveScout protection mechanism generates resources in response to scans and blocks users for a period of time after they have proved hostile by accessing the generated resources. All test subjects eventually got blocked; the block was released automatically after a period of time. This behavior contributed to the frustration of the test subjects, as the site did not seem to provide a consistent image of its resources over time. Frustration is not a trait of automated agents. While the test subjects were familiar with mechanisms that block offensive users, they were familiar only with signature-based mechanisms, not with mechanisms that work by detecting access to "virtual" resources.

Another conclusion is that hackers may not immediately follow the reconnaissance phase with an attack phase. Some hackers will take the results, analyze them manually, and then employ exploits against each vulnerable spot at their leisure. Additionally, these separate stages may come at the target from different source IP addresses, due to the use of DHCP or even due to an active attempt to disguise the origin of the attack. During the reconnaissance phase, when hackers employ automated scanning tools, they are still considered automated agents; when they turn to selectively attacking scanned resources, they appear as manual sources of attack. The algorithm proved robust even when hackers perform the reconnaissance phase from a different source address than the one they attack from: the reconnaissance source address is detected (correctly) as an automated source, and the attack is detected as a more adaptive source. Running several automated tools from the same source IP can produce enough adaptability to be counted as a manual source, which is as expected: an attacker employing a fully automated script will produce different results than an attacker running several scanning tools in response to results from the site.

While the human test subjects in this experiment employed automated tools for most of the reconnaissance phase, they performed the attack phase manually. This result is reinforced by the fact that scanning tools are far more widely available than automated "autorooter" tools, which perform the attack cycle from beginning to end and finally provide the user with a list of compromised hosts. Most of the "attack tools" available break into a single host, and the user is left to decide on his own what to do after the break-in.
A reasonable assumption would be that users who write even more advanced attack tools show even greater adaptability and randomness of action when they finally act manually on their own.

2.1.5 Analysis of recent network worms as examples of automated agents

To complete the process of establishing ground truth, we need to look at the other side of the spectrum: fully automated sources of attack. The most common automated agent is the network worm. The following is a study of recent well-known worms discovered on the Internet. We make a distinction between email worms and network worms; the difference between these two kinds of malicious agents is that email-based worms usually require some sort of human interaction, such as opening the mail message and executing an attachment.

The claim we need to establish is that network worms have a simple state machine, resulting in a predictable order of actions. Simple, predictable behavior results in a low adaptability score. The worms studied are summarized below; for each worm, the adaptability score is calculated according to the worm's algorithm, and the list of ICMP echo requests and TCP/UDP ports the worm accesses, which is the basis for that score, is included. In some cases a copy of network traffic from a captured host was not available, and we had to rely on public knowledge bases such as SANS [38] and CERT [39] to reconstruct the worm's algorithm, from which an adaptability score was calculated.

2.1.5.1 Sadmind

First report by CERT: May 10, 2001 [22].

The Sadmind worm (also named PoisonBox.worm in some sources) employs an exploit for a Solaris service (Sadmind) to spread. Once a machine is infected, in addition to propagating, the machine will scan for and deface IIS web servers. This is an example of a multi-mode worm, although the second mode is used for defacement and not propagation. Since the IIS defacement is performed in conjunction with the propagation process, there are two possible port series for this worm; the worm attempts the same series of ports for all victims, be they exploitable Solaris workstations or defaceable IIS machines. The port series calculated for this worm are:

1. 111/TCP, 600/TCP – the propagation phase: the first port is the portmapper service used for exploitation, and the exploit opens a backdoor on port 600/TCP.
2. 80/TCP – the port used to launch an exploit against IIS machines, which then download the defaced web pages from the attacking machine.

The adaptability score for this worm is 2, due to the additional feature of web page defacement, a second series of actions attempted against hosts independently of the first.
2.1.5.2 Code Red I & II

The Code Red worm has several variants, but the security community distinguishes between two major variants sharing the same vulnerability. The first Code Red variant was reported by CERT on Jul. 19, 2001 [23]; Code Red II was reported by CERT on Aug. 6, 2001.

Code Red connects to the HTTP port of any victim and sends a specially crafted request containing the worm itself; the worm code executes from the stack of the exploited process. This worm's state machine is simple and consists of two stages: the generation of a random IP address, and the probe/attack/spread phase, which is contained within the malicious payload. The list of ports contains only port 80/TCP, resulting in an adaptability score of 1.

2.1.5.3 Nimda

First report by CERT: Sep. 18, 2001 [26].

Nimda is a multi-mode worm, which spreads by:

1. Infecting files in open NetBIOS shares.
2. Infecting web pages present on the attacked computer, thereby spreading to unsuspecting web clients.
3. Replicating by sending itself as an executable attachment in an email message from the infected computer.
4. Attacking IIS machines, using a weakness of the HTTP server.
5. Exploiting the backdoors left by Code Red II and Sadmind.

The Nimda worm gains a score of 2 with the algorithm, as the worm attacks the following port groups:

1. 80/TCP [IIS & Code Red backdoor]
2. 445/TCP, 139/TCP [open NetBIOS shares]

The web page infection and the infecting email cannot be seen by our chosen IDS. Although this worm has several modes of attack, its algorithm is based on a simple state machine; even taking into consideration the modes not seen by our IDS, the worm would still retain a low score of 3. This worm does, however, merit the discussion of multi-mode worms below.

2.1.5.4 Spida

First report by CERT: May 22, 2002 [24].

The Spida worm connects to Microsoft SQL Server and exploits a default "null" password for an administrative account in order to propagate and spread. The worm operates entirely over 1433/TCP, giving it an adaptability score of 1.
2.1.5.5 Slapper

First report by CERT: Sep. 14, 2002 [25].

The Slapper worm tests systems for mod_ssl (the vulnerable web server module) by grabbing the web page banner from 80/TCP. If mod_ssl is present, the worm connects to port 443/TCP and launches an exploit. Because this worm is Linux based, it downloads and recompiles its code on the infected system, as opposed to other worms that use static binaries. Since the worm follows a pre-defined port order for all victims (80/TCP, 443/TCP), it has an adaptability score of 1.

2.1.5.6 Slammer

First report by CERT: Jan. 27, 2003 [27].

The worm sends a specially crafted UDP packet to SQL services, which causes the SQL server to start sending the same packet to random IP addresses in a never-ending loop. The worm's state machine is extremely simple and is similar to the one employed by Code Red II above: generate an IP address and probe/attack/spread. This malware has an adaptability score of 1, with a port series consisting simply of 1434/UDP.

2.1.5.8 Blaster

First report by CERT: Aug. 11, 2003 [28].

Blaster attacks by exploiting a weakness in Microsoft DCOM RPC [41] services. The worm connects to port 135/TCP, and the compromised host is instructed to "call back" to the attacker and retrieve the worm code. The worm operates on this outgoing port only, gaining it a score of 1.

2.1.5.9 Welchia/Nachi

First report by CERT: August 18, 2003.

Shortly after Blaster was released, a worm dubbed "Welchia" [40] was unleashed. This worm was especially interesting, as it seems to have been written with the intent of being a "benign worm": when Welchia successfully infects a host, it first seeks and removes any copies of the Blaster worm, and additionally attempts to install the relevant patch from Microsoft. Welchia exploits two vulnerabilities in Microsoft systems: the RPC DCOM vulnerability used by Blaster, and a vulnerability in NTDLL commonly known as the "WebDAV" vulnerability. The worm seeks out one of 76 hard-coded class B networks when attempting to infect via the WebDAV vulnerability. Presumably, the worm's author scanned these networks beforehand, as the networks are owned by Chinese organizations, and the WebDAV exploit bundled with Welchia only works on some double-byte character platforms, Chinese being one of them.
As Welchia targets different hosts with its two exploits, the worm gains an adaptability score of 2.

2.1.5.10 Sasser

First report by CERT: May 1, 2004; reported by Symantec on April 30, 2004.

The Sasser worm operates in a way not dissimilar to Blaster. The worm exploits an RPC service on the Microsoft Windows platform; the exploit causes a remote command backdoor to be opened, through which the worm instructs the victim to download and execute the malicious payload. The port series for this worm is 445/TCP, with 9996/TCP for the remote backdoor created. Some variants use ICMP ping requests to speed up the process, but all known variants follow a single predetermined port series for all victims, gaining the worm a score of 1.

2.1.5.11 Santy/PHP include worms

First report by CERT: December 21, 2004.

The Santy worm is an example of a topological worm. It gathers a "hit list" of targets to attack from a Google query (later variants use other search engines) looking for a vulnerable version of a PHP (an HTML scripting language used to present dynamic web pages) application, specifically phpBB, a bulletin board package. Since this worm does not perform any reconnaissance, it will bypass our chosen IDS system: Active Response technology depends on intruders using baits distributed during an early reconnaissance phase. However, going over captured traffic from this worm shows that it performs a simple series of actions, all over the HTTP port. The algorithm awards this worm an adaptability score of 1, as it does not divert from this order of actions, nor does it try any other services.

2.1.6 Multi mode worms, discussion

As seen above, most worms have a very simple state machine and tend to follow a very specific order of actions for each host attacked. However, there are worms that attempt several exploits against a single target; examples from the above list are Nimda and Sadmind. Such worms gain a higher adaptability score if they attempt a different set of attacks for each host. Multi-mode worms still linger only a very short while with each victim, and they still follow a pre-defined order of actions, although their algorithm is more complex. Multi-mode worms tend to have fewer variants and are generally rare (although this trend may change in the future).
2.1.7 Topologically aware worms, discussion

Topological worms are worms that use additional sources of information when deciding which victim to attack; an example is Santy (above). Such worms are not more or less adaptive than other worms, as the process of choosing victims is independent of the process of attacking each victim. These worms do present a problem for this research: our chosen IDS depends on an intruder (be it an adaptive source such as a human or an automatic source) doing some reconnaissance before attacking. If a worm is topologically aware, it may be able to ignore the virtual resources the Active Response technology sets as baits, by using information gleaned elsewhere, such as a search engine. This problem is mitigated by the fact that, given a traffic dump of a source infected by a topologically aware worm, an adaptability score can still be determined manually.
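To recap the worm analysis above, the per-worm port series and the adaptability scores derived from them can be collected in one place. The snippet below only restates the scores assigned in sections 2.1.5.1 through 2.1.5.11; the port series listed for Welchia's WebDAV vector (80/TCP) is an inference from WebDAV running over HTTP rather than a figure stated in the text, and the data structure is an illustrative summary, not part of the detection algorithm.

# Port series per worm, as derived in sections 2.1.5.1 - 2.1.5.11.
# The adaptability score is the number of distinct series.
worm_series = {
    "Sadmind":       [("111/TCP", "600/TCP"), ("80/TCP",)],
    "Code Red I/II": [("80/TCP",)],
    "Nimda":         [("80/TCP",), ("445/TCP", "139/TCP")],
    "Spida":         [("1433/TCP",)],
    "Slapper":       [("80/TCP", "443/TCP")],
    "Slammer":       [("1434/UDP",)],
    "Blaster":       [("135/TCP",)],
    "Welchia":       [("135/TCP",), ("80/TCP",)],   # RPC DCOM and WebDAV vectors
    "Sasser":        [("445/TCP", "9996/TCP")],
    "Santy":         [("80/TCP",)],
}

for worm, series in worm_series.items():
    print(f"{worm}: adaptability score = {len(set(series))}")
# Every worm listed scores 1 or 2, well below the 3-5 range of the human subjects.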