This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 8, NO. 3, MAY-JUNE 2011
Authorized licensed use limited to: Universidade de Macau. Downloaded on July 16, 2010 at 02:00:53 UTC from IEEE Xplore. Restrictions apply.

adopt such smart attack strategies could exhibit overall scan traffic patterns different from those of traditional worms. Since the existing worm detection schemes will not be able to detect such scan traffic patterns, it is very important to understand such smart-worms and develop new countermeasures to defend against them.

In this paper, we conduct a systematic study on a new class of such smart-worms denoted as Camouflaging Worm (C-Worm in short). The C-Worm has a self-propagating behavior similar to traditional worms, i.e., it intends to rapidly infect as many vulnerable computers as possible. However, the C-Worm is quite different from traditional worms in that it camouflages any noticeable trends in the number of infected computers over time. The camouflage is achieved by manipulating the scan traffic volume of worm-infected computers. Such a manipulation of the scan traffic volume prevents exhibition of any exponentially increasing trends or even crossing of thresholds that are tracked by existing detection schemes. We note that the propagation controlling nature of the C-Worm (and similar smart-worms, such as "Atak") causes a slowdown in the propagation speed. However, by carefully controlling its scan rate, the C-Worm can: (a) still achieve its ultimate goal of infecting as many computers as possible before being detected, and (b) position itself to launch subsequent attacks.

We comprehensively analyze the propagation model of the C-Worm and corresponding scan traffic in both time and frequency domains. We observe that although the C-Worm scan traffic shows no noticeable trends in the time domain, it demonstrates a distinct pattern in the frequency domain. Specifically, there is an obvious concentration within a narrow range of frequencies. This concentration within a narrow range of frequencies is inevitable since the C-Worm adapts to the dynamics of the Internet in a recurring manner for manipulating and controlling its overall scan traffic volume. The above recurring manipulations involve a steady increase, followed by a decrease in the scan traffic volume, such that the changes do not manifest as any trends in the time domain or such that the scan traffic volume does not cross thresholds that could reveal the C-Worm propagation.

Based on the above observation, we adopt frequency domain analysis techniques and develop a detection scheme against wide-spreading of the C-Worm. Particularly, we develop a novel spectrum-based detection scheme that uses the Power Spectral Density (PSD) distribution of scan traffic volume in the frequency domain and its corresponding Spectral Flatness Measure (SFM) to distinguish the C-Worm traffic from non-worm traffic (background traffic). Our frequency domain analysis studies use the real-world Internet traffic traces (Shield logs dataset) provided by SANS Internet Storm Center (ISC) (see footnote 2). Our results reveal that non-worm traffic (e.g., port-scan traffic for port 80, 135 and 8080) has relatively larger SFM values for their PSD distributions, whereas the C-Worm traffic shows a comparatively smaller SFM value for its respective PSD distribution.

Footnote 2: ISC monitors and collects port-scan traffic data from around 1 million IP addresses spanning several thousands of organizations in different geographical regions.

Furthermore, we demonstrate the effectiveness of our spectrum-based detection scheme in comparison with existing worm detection schemes. We define several new metrics. Maximal Infection Ratio (MIR) quantifies the infection damage caused by a worm before being detected. Other metrics include Detection Time (DT) and Detection Rate (DR). Our evaluation data clearly demonstrate that our spectrum-based detection scheme achieves much better detection performance against the C-Worm propagation compared with existing detection schemes. Our evaluation also shows that our spectrum-based detection scheme is general enough to be used for effective detection of traditional worms as well.

The remainder of the paper is organized as follows. In Section 2, we introduce the background and review the related work. In Section 3, we introduce the propagation model of the C-Worm. We present our spectrum-based detection scheme against the C-Worm in Section 4. The performance evaluation results of our spectrum-based detection scheme are provided in Section 5. We conclude this paper in Section 6.

2 BACKGROUND AND RELATED WORK

2.1 Active Worms

Active worms are similar to biological viruses in terms of their infectious and self-propagating nature. They identify vulnerable computers and infect them, and the worm-infected computers then propagate the infection further to other vulnerable computers. In order to understand worm behavior, we first need to model it. With this understanding, effective detection and defense schemes could be developed to mitigate the impact of the worms. For this reason, tremendous research effort has focused on this area.

Active worms use various scan mechanisms to propagate themselves efficiently. The basic form of active worms can be categorized as having the Pure Random Scan (PRS) nature. In the PRS form, a worm-infected computer continuously scans a set of random Internet IP addresses to find new vulnerable computers. Other worms propagate themselves more effectively than PRS worms using various methods, e.g., network port scanning, email, file sharing, Peer-to-Peer (P2P) networks, and Instant Messaging (IM). In addition, worms use different scan strategies during different stages of propagation. In order to increase propagation efficiency, they use a local network or hitlist to infect previously identified vulnerable computers at the initial stage of propagation. They may also use DNS, network topology and routing information to identify active computers instead of randomly scanning IP addresses. They split the target IP address space during propagation in order to avoid duplicate scans. Li et al. studied a divide-conquer scanning technique that could potentially spread faster and stealthier than a traditional random-scanning worm. Ha et al. formulated the problem of finding a fast and resilient propagation topology and propagation schedule for Flash worms. Yang et al. studied worm propagation over sensor networks.

Different from the above worms, which attempt to accelerate the propagation with new scan schemes, the Camouflaging
Worm (C-Worm) studied in this paper aims to elude the detection by the worm defense system during worm propagation. Closely related, but orthogonal to our work, are the evolved active worms that are polymorphic in nature. Polymorphic worms are able to change their binary representation or signature as part of their propagation process. This can be achieved with self-encryption mechanisms or semantics-preserving code manipulation techniques. The C-Worm also shares some similarity with stealthy port-scan attacks. Such attacks try to find out available services in a target system, while avoiding detection. This is accomplished by decreasing the port scan rate, hiding the origin of attackers, etc. Due to the nature of self-propagation, the C-Worm must use more complex mechanisms to manipulate the scan traffic volume over time in order to avoid detection.

2.2 Worm Detection

Worm detection has been intensively studied in the past and can be generally classified into two categories: "host-based" detection and "network-based" detection. Host-based detection systems detect worms by monitoring, collecting, and analyzing worm behaviors on end-hosts. Since worms are malicious programs that execute on these computers, analyzing the behavior of worm executables plays an important role in host-based detection systems. Many detection schemes fall under this category. In contrast, network-based detection systems detect worms primarily by monitoring, collecting, and analyzing the scan traffic (messages to identify vulnerable computers) generated by worm attacks. Many detection schemes fall under this category. Ideally, security vulnerabilities must be prevented to begin with, a problem which must be addressed by the programming language community. However, while vulnerabilities exist and pose threats of large-scale damage, it is critical to also focus on network-based detection, as this paper does, to detect wide-spreading worms.

In order to rapidly and accurately detect Internet-wide large scale propagation of active worms, it is imperative to monitor and analyze the traffic in multiple locations over the Internet to detect suspicious traffic generated by worms. The widely adopted worm detection framework consists of multiple distributed monitors and a worm detection center that controls the former. This framework is well adopted and similar to other existing worm detection systems, such as the Cyber center for disease controller, Internet motion sensor, SANS ISC (Internet Storm Center), Internet sink, and network telescope. The monitors are distributed across the Internet and can be deployed at end-hosts, routers, or firewalls etc. Each monitor passively records irregular port-scan traffic, such as connection attempts to a range of void IP addresses (IP addresses not being used) and restricted service ports. Periodically, the monitors send traffic logs to the detection center. The detection center analyzes the traffic logs and determines whether or not there are suspicious scans to restricted ports or to invalid IP addresses.

Network-based detection schemes commonly analyze the collected scanning traffic data by applying certain decision rules for detecting the worm propagation. For example, Venkataraman et al. and Wu et al. proposed schemes to examine statistics of scan traffic volume, Zou et al. presented a trend-based detection scheme to examine the exponential increase pattern of scan traffic, and Lakhina et al. proposed schemes to examine other features of scan traffic, such as the distribution of destination addresses. Other works study worms that attempt to take on new patterns to avoid detection.

Besides the above detection schemes, which detect anomalous traffic behavior through global scan traffic monitoring, there are other worm detection and defense schemes, such as sequential hypothesis testing for detecting worm-infected computers and payload-based worm signature detection. In addition, Cai et al. presented both theoretical modeling and experimental results on a collaborative worm signature generation system that employs distributed fingerprint filtering and aggregation and multiple edge networks. Dantu et al. presented a state-space feedback control model that detects and controls the spread of these viruses or worms by measuring the velocity of the number of new connections an infected computer makes. Despite the different approaches described above, we believe that detecting widely scanning anomaly behavior continues to be a useful weapon against worms, and that in practice multi-faceted defence has advantages.

3 MODELING OF THE C-WORM

3.1 C-Worm

The C-Worm camouflages its propagation by controlling scan traffic volume during its propagation. The simplest way to manipulate scan traffic volume is to randomly change the number of worm instances conducting port-scans.

As other alternatives, a worm attacker may use an open-loop control (non-feedback) mechanism by choosing a randomized and time related pattern for the scanning and infection in order to avoid being detected. Nevertheless, the open-loop control approach raises some issues of the invisibility of the attack. First, as we know, worm propagation over the Internet can be considered a dynamic system. When an attacker launches worm propagation, it is very challenging for the attacker to know the accurate parameters for worm propagation dynamics over the Internet. Given the inaccurate knowledge of worm propagation over the Internet, the open-loop control system will not be able to stabilize the scan traffic. This is a known result from control system theory. Consequently, the overall worm scan traffic volume in the open-loop control system will have a much higher probability of showing an increasing trend with the progress of worm propagation. As more and more computers get infected, they, in turn, take part in scanning other computers. Hence, we consider the C-Worm as a worst case attacking scenario that uses a closed-loop control for regulating the propagation speed based on the feedback propagation status.

In order to effectively evade detection, the overall scan traffic for the C-Worm should be comparatively slow and variant enough to not show any notable increasing trends over
time. On the other hand, a very slow propagation of the C-Worm is also not desirable, since it delays rapid infection damage to the Internet. Hence, the C-Worm needs to adjust its propagation so that it is neither too fast to be easily detected, nor too slow to delay rapid damage on the Internet.

To regulate the C-Worm scan traffic volume, we introduce a control parameter called attack probability $P(t)$ for each worm-infected computer. $P(t)$ is the probability that a C-Worm instance participates in the worm propagation (i.e., scans and infects other computers) at time $t$. Our C-Worm model with the control parameter $P(t)$ is generic. $P(t) = 1$ represents the cases for traditional worms, where all worm instances actively participate in the propagation. For the C-Worm, $P(t)$ need not be a constant value and can be set as a time varying function.

In order to achieve its camouflaging behavior, the C-Worm needs to obtain an appropriate $P(t)$ to manipulate its scan traffic. Specifically, the C-Worm will regulate its overall scan traffic volume such that: (a) it is similar to non-worm scan traffic in terms of the scan traffic volume over time, (b) it does not exhibit any notable trends, such as an exponentially increasing pattern or any mono-increasing pattern even when the number of infected hosts increases (exponentially) over time, and (c) the average value of the overall scan traffic volume is sufficient to make the C-Worm propagate fast enough to cause rapid damage on the Internet (see footnote 3).

Footnote 3: Note that if the C-Worm chooses $P(t)$ below a certain (very low) level, other human-scale countermeasures (e.g., signature-based virus detection, machine quarantine) may become effective to disrupt the propagation.

We assume that a worm attacker intends to manipulate scan traffic volume so that the number of worm instances participating in the worm propagation follows a random distribution with mean $\bar{M}_C$. This $\bar{M}_C$ can be regulated in a random fashion during worm propagation in order to camouflage the propagation of the C-Worm. Correspondingly, the worm instances need to adjust their attack probability $P(t)$ in order to ensure that the total number of worm instances launching the scans is approximately $\bar{M}_C$.

To regulate $\bar{M}_C$, it is obvious that $P(t)$ must be decreased over time since $M(t)$ keeps increasing during the worm propagation. We can express $P(t)$ using a simple function as follows: $P(t) = \min\left(\bar{M}_C / \bar{M}(t),\, 1\right)$, where $\bar{M}(t)$ represents the estimation of $M(t)$ at time $t$. From the above expression, we know that the C-Worm needs to obtain the value of $\bar{M}(t)$ (as close to $M(t)$ as possible) in order to generate an effective $P(t)$. Here, we discuss one approach for the C-Worm to estimate $M(t)$. The basic idea is as follows: A C-Worm could estimate the percentage of computers that have already been infected over the total number of IP addresses as well as $M(t)$, through checking a scan attempt as a new hit (i.e., hitting an uninfected vulnerable computer) or a duplicate hit (i.e., hitting an already infected vulnerable computer). This method requires each worm instance (i.e., infected computer) to be marked indicating that this computer has been infected. Thus, when a worm instance (for example, computer A) scans one infected computer (for example, computer B), then computer A will detect such a mark, thereby becoming aware that computer B has been infected. Through validating such marks during the propagation, a C-Worm infected computer can estimate $M(t)$. Appendix A discusses one alternative for how the C-Worm could estimate $M(t)$ to obtain $\bar{M}(t)$ as the propagation proceeds. There are other approaches to achieve this goal, such as incorporating the Peer-to-Peer techniques to disseminate information through secured IRC channels.

3.2 Propagation Model of the C-Worm

To analyze the C-Worm, we adopt the epidemic dynamic model for disease propagation, which has been extensively used for worm propagation modeling. Based on existing results, this model matches the dynamics of real worm propagation over the Internet quite well. For this reason, similar to other publications, we adopt this model in our paper as well. Since our investigated C-Worm is a novel attack, we modified the original epidemic dynamic formula to model the propagation of the C-Worm by introducing $P(t)$, the attack probability that a worm-infected computer participates in worm propagation at time $t$. We note that there is wide scope to improve our modified model in the future to reflect several characteristics that are relevant in real-world practice.

Particularly, the epidemic dynamic model assumes that any given computer is in one of the following states: immune, vulnerable, or infected. An immune computer is one that cannot be infected by a worm; a vulnerable computer is one that has the potential of being infected by a worm; an infected computer is one that has been infected by a worm. The simple epidemic model for a finite population of traditional PRS worms can be expressed as (see footnote 4),

$$\frac{dM(t)}{dt} = \beta \cdot M(t) \cdot [N - M(t)], \tag{1}$$

where $M(t)$ is the number of infected computers at time $t$; $N$ $(= T \cdot P_1 \cdot P_2)$ is the number of vulnerable computers on the Internet; $T$ is the total number of IP addresses on the Internet; $P_1$ is the ratio of the total number of computers on the Internet over $T$; $P_2$ is the ratio of total number of vulnerable computers on the Internet over the total number of computers on the Internet; $\beta = S/V$ is called the pairwise infection rate; $S$ is the scan rate defined as the number of scans that an infected computer can launch in a given time interval. We assume that at $t = 0$, there are $M(0)$ computers being initially infected and $N - M(0)$ computers being susceptible to further worm infection.

Footnote 4: We would like to remark that we use the PRS worms to compare C-Worm performance, but our work can be easily extended to compare with other worm scan techniques, such as hitlist.

The C-Worm has a different propagation model compared to traditional PRS worms because of its $P(t)$ parameter. Consequently, Formula (1) needs to be rewritten as,

$$\frac{dM(t)}{dt} = \beta \cdot M(t) \cdot P(t) \cdot [N - M(t)]. \tag{2}$$

Recall that $P(t) = \bar{M}_C / \bar{M}(t)$, $\bar{M}(t)$ is the estimation of $M(t)$ at time $t$, and assuming that $\bar{M}(t) = (1 + \epsilon) \cdot M(t)$, where $\epsilon$ is
the estimation error, the Formula (2) can be rewritten as,

$$\frac{dM(t)}{dt} = \frac{\beta \cdot \bar{M}_C}{1 + \epsilon(t)} \cdot [N - M(t)]. \tag{3}$$

With Formula (3), we can derive the propagation model for the C-Worm as $M(t) = N - e^{-\frac{\beta \cdot \bar{M}_C}{1 + \epsilon(t)} \cdot t} \, (N - M(0))$, where $M(0)$ is the number of infected computers at time 0. Assume that the worm detection system can monitor $P_m$ ($P_m \in [0, 1]$) of the whole Internet IP address space. Without loss of generality, the probability that at least one scan from a worm-infected computer (it generates $S$ scans in unit time on average) will be observed by the detection system is $1 - (1 - P_m)^{P(t) \cdot S}$. We define $M_A(t)$ as the number of worm instances that have been observed by the worm detection system at time $t$; then there are $M(t) - M_A(t)$ unobserved infected instances at time $t$. At the early stage of worm propagation, $M(t) - M_A(t) \approx M(t)$. The expected number of newly observed infected instances at $t + \delta$ (where $\delta$ is the interval of monitoring) is $(M(t) - M_A(t)) \cdot [1 - (1 - P_m)^{P(t) \cdot S}] \approx M(t) [1 - (1 - P_m)^{P(t) \cdot S}]$. Thus, we have $M_A(t + \delta) = M_A(t) + M(t) [1 - (1 - P_m)^{P(t) \cdot S}]$. Using simple mathematical manipulations, the number of worm instances observed by the worm detection system at time $t$ is,

$$M_A(t) = P(t) \cdot M(t) \cdot P_m = \frac{P_m \cdot \bar{M}_C}{1 + \epsilon(t)}. \tag{4}$$

3.3 Effectiveness of the C-Worm

We now demonstrate the effectiveness of the C-Worm in evading worm detection through controlling $P(t)$. Given random selection of $\bar{M}_C$, we generate three C-Worm attacks (viz., C-Worm 1, C-Worm 2 and C-Worm 3) that are characterized by different selections of mean and variance magnitudes for $\bar{M}_C$. In our simulations, we assume that the scan rate of the traditional PRS worm follows a normal distribution $S_n = N(40, 40)$ (note that if the scan rate generated by the above distribution is less than 0, we set the scan rate as 0). We also set the total number of vulnerable computers on the Internet as 360,000, which is the total number of infected computers in the "Code-Red" worm incident.

Fig. 1 shows the observed number of worm-infected computers over time for the PRS worm and the above three C-Worm attacks. Fig. 2 shows the infection ratio for the PRS worm and the above three C-Worm attacks. These simulations are for a worm detection system discussed in Section 2.2 that covers a $2^{20}$ IPv4 address space on the Internet. The reason for choosing $2^{20}$ IP addresses as the coverage space of the worm detection system is that the SANS Internet Storm Center (ISC), a representative ITM system, has similar coverage space. In the ITM systems, a large number of monitors are commonly deployed all over the Internet and each monitor collects the traffic directed to a small set of IP address spaces which are not commonly used (also called dark IP addresses). Therefore, the address space of an ITM system is not a narrow range address space, but rather a large number of small chunks of addresses randomly spread across the global IP address space.

Fig. 1. Observed infected instance number for the C-Worm and PRS worm. [Plot: number of detected scanning hosts versus time (min) for PRS, C-Worm 1, C-Worm 2 and C-Worm 3.]

Fig. 2. Infected ratio for the C-Worm and PRS-Worm. [Plot: infection ratio (IR) versus time (min) for PRS, C-Worm 1, C-Worm 2 and C-Worm 3.]

For the C-Worm, the trend of observed number of worm instances over time ($M_A(t)$, defined in Formula (4)) is much different from that of the traditional PRS worm as shown in Fig. 1. This clearly demonstrates how the C-Worm successfully camouflages its increase in the number of worm instances ($M_A(t)$) and avoids detection by worm detection systems that expect exponential increases in worm instance numbers during large-scale worm propagation. Fig. 3 shows the number of scanning computers from normal non-worm port-scanning traffic (background traffic) for several well-known ports (i.e., 25, 53, 135, and 8080) obtained over several months by the ISC. Comparing Fig. 3 with Fig. 1, we can observe that it is hard to distinguish the C-Worm port-scan traffic from background port-scanning traffic in the time domain.

From Figs. 1 and 2 above, we also observe that the C-Worm is still able to maintain a certain magnitude of scan traffic so as to cause significant infection on the Internet. As a note regarding the speed of C-Worm propagation, we can observe from Fig. 2 that the C-Worm takes approximately 10 days to infect 75% of total vulnerable hosts in comparison with the 3.3 days taken by a PRS worm (see footnote 5). Hence, the C-Worm could potentially adjust its propagation speed such that it is still effective in causing wide-spreading propagation, while avoiding being detected by the worm detection schemes.

Footnote 5: Our simulated PRS worm has a lower scan rate (mean value of 40) than "Code-Red" (mean value of 358).

We discussed the "Atak" worm in Section 1 and mentioned that it is similar to the C-Worm since it tries to avoid being detected, when it suspects that it is being detected by anti-worm software. However, it differs from the C-Worm in its behavior. The "Atak" worm attempts to hide only during times it suspects its propagation will be detected by anti-worm
software. In contrast, the C-Worm proactively camouflages itself at all times. In addition, the "Self-stopping" worm attempts to hide by co-ordinating with its members to halt propagation activity only after the vulnerable population is subverted. This behavior leaves enough evidence for worm detection systems to recognize its propagation. The C-Worm, on the other hand, hides itself even during its propagation and thus keeps the worm detection schemes completely unaware of its propagation. The C-Worm also has some similarity in spirit with polymorphic worms that manipulate the byte stream of worm payload in order to avoid the detection of signature (payload)-based detection schemes. The manipulation of worm payload can be achieved by various mechanisms: (a) interleaving meaningful instructions with NOP (no operation), (b) using different instructions to achieve the same results, (c) shuffling the register set in each worm propagation program code copy, and (d) using cryptography mechanisms to change worm payload signature with every infection attempt. In contrast, the C-Worm tries to manipulate the scan traffic pattern to avoid detection.

Fig. 3. Observed infected instance number for background scanning reported by ISC. [Four panels: scanning traffic volume for port 25, port 53, port 135 and port 8080; number of scans versus time (unit: 20 min).]

3.4 Discussion

In this paper, we focus on a new class of worms, referred to as the camouflaging worm (C-Worm). The C-Worm adapts its propagation traffic patterns in order to reduce the probability of detection, and to eventually infect more computers. The C-Worm is different from polymorphic worms that deliberately change their payload signatures during propagation. For example, MetaPHOR and Zmist worms intensively metamorphose their payload signature to hide themselves from detection schemes that rely on expensive packet payload analysis. Bethencourt et al. studied the worm which employs private information retrieval techniques to find and retrieve specific pieces of sensitive information from compromised computers while hiding its search criteria. Sharif et al. presented an obfuscation-based technique that automatically conceals specific condition dependent malicious behavior from virus detectors that have no prior knowledge of program inputs. Popov et al. investigated a technique that allows the worm programs to be obfuscated by changing many control transfers into signals (traps) and inserting dummy control transfers and "junk" instructions after the signals. The resulting code can significantly reduce the chance of being detected. Recent studies also showed that existing commercial anti-worm detection systems fail to detect brand new worms and can also be easily circumvented by worms that use simple mutation techniques to manipulate their payload.

Although in this paper we only demonstrate effectiveness of the C-Worm against existing traffic volume-based detection schemes, the design principle of the C-Worm can be extended to defeat other newly developed detection schemes, such as destination distribution-based detection. In the following, we discuss this preliminary concept. Recall that the attack target distribution based schemes analyze the distribution of attack targets (the scanned destination IP addresses) as basic detection data to capture the fundamental features of worm propagation, i.e., they continuously scan different targets, which is not the expected behavior of non-worm scan traffic. However, our initial investigation shows that the worm attacker is still able to defeat such a countermeasure via manipulating the attack target distribution. For example, the attacker may launch a portion of scan traffic bound for some IP addresses monitored by the ITM system. Recall that those dedicated IP addresses monitored by the ITM system can be obtained via probing attacks or other means.

Using port 135 reported by SANS ISC as an example, we analyze the traces and obtain the traffic target distribution in a window lasting 10 mins. Following existing work, we use entropy as the metric to measure the attack target distribution. Fig. 4 shows the Probability Density Function (PDF) of background traffic's entropy values. We also simulate the worm propagation traffic, which allocates a portion of scan traffic bound for IP addresses monitored by the ITM system. Following this, we obtain the PDF of the entropy value for combined traffic including both worm propagation and background traffic. From Fig. 4, we know that when the attacker uses a portion of attack traffic to manipulate the target distribution, the entropy-based detection scheme can degrade significantly. For example, when the attacker uses 10% traffic to manipulate the traffic's entropy value, the false positive rate of entropy-based detection scheme is 14%. When the attacker uses 30% traffic to manipulate the traffic's entropy value, the false positive rate becomes 40%. Hence, in order to preserve the performance, entropy-based detection scheme needs to evolve correspondingly and integrate with other detection schemes. We will perform a more detailed study of this aspect in our future work.

4 DETECTING THE C-WORM

4.1 Design Rationale

In this section, we develop a novel spectrum-based detection scheme. Recall that the C-Worm goes undetected by detection schemes that try to determine the worm propagation only in the time domain. Our detection scheme captures the distinct pattern of the C-Worm in the frequency domain, and thereby has the potential of effectively detecting the C-Worm propagation.
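As a rough illustration of this frequency-domain rationale, the sketch below computes a periodogram (a simple estimate of the PSD) for two synthetic series and then takes the Spectral Flatness Measure as the ratio of the geometric mean to the arithmetic mean of the PSD coefficients. It is a minimal sketch under stated assumptions, not the detection scheme evaluated in this paper: the function names `power_spectrum` and `spectral_flatness`, the naive DFT, the `floor` guard and both synthetic series are hypothetical choices made only for illustration.

```python
import cmath
import math
import random


def power_spectrum(series):
    """Periodogram estimate of the PSD via a naive O(n^2) DFT (DC bin skipped)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_flatness(psd, floor=1e-12):
    """Geometric mean over arithmetic mean of the PSD coefficients, in [0, 1]."""
    values = [max(p, floor) for p in psd]  # floor avoids log(0); an assumption
    arithmetic = sum(values) / len(values)
    geometric = math.exp(sum(math.log(v) for v in values) / len(values))
    return geometric / arithmetic


n = 64
# Hypothetical scan-volume series with one recurring rise-and-fall pattern.
recurring = [100 + 50 * math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
# Hypothetical background series: uncorrelated fluctuations around a mean.
rng = random.Random(0)
noisy = [100 + rng.gauss(0, 50) for _ in range(n)]

sfm_recurring = spectral_flatness(power_spectrum(recurring))
sfm_noisy = spectral_flatness(power_spectrum(noisy))
print(f"recurring: {sfm_recurring:.4f}, noisy: {sfm_noisy:.4f}")
```

In this toy setting the recurring series concentrates its power in a single frequency bin, so its SFM is close to 0, while the noise-like series spreads power across bins and yields a clearly larger SFM. Real scan traffic is far less clean than either synthetic series, so the gap seen here should not be read as a prediction of the thresholds reported in the evaluation.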
Fig. 4. Manipulation of attack target distribution entropy

In order to identify the C-Worm propagation in the frequency domain, we use the distribution of the Power Spectral Density (PSD) and its corresponding Spectral Flatness Measure (SFM) of the scan traffic. In particular, the PSD describes how the power of a time series is distributed in the frequency domain. Mathematically, it is defined as the Fourier transform of the auto-correlation of a time series. In our case, the time series corresponds to the changes in the number of worm instances that actively conduct scans over time. The SFM of the PSD is defined as the ratio of the geometric mean to the arithmetic mean of the PSD coefficients. The range of SFM values is [0, 1]; a larger SFM value implies a flatter PSD distribution, and vice versa.

To illustrate the SFM values of both the C-Worm and normal non-worm scan traffic, we plot the Probability Density Function (PDF) of the SFM for both, as shown in Fig. 5 and Fig. 6, respectively. The normal non-worm scan traffic data shown in Fig. 6 is based on real-world traces collected by the ISC (see footnote 6). Note that we only show the data for port 8080 as an example; other ports show similar observations. From this figure, we know that the SFM value for normal non-worm traffic is comparatively large (e.g., SFM ∈ (0.5, 0.6) has high density). The C-Worm data shown in Fig. 5 is based on 800 C-Worm attacks generated by varying the attack parameters defined in Section 3, such as $P(t)$ and $M_c(t)$. From this figure, we know that the SFM value of the C-Worm attacks is very small (e.g., SFM ∈ (0.02, 0.04) has much higher density compared with other magnitudes). From the above two figures, we can observe that there is a clear demarcation range of SFM ∈ (0.3, 0.38) between the C-Worm and normal non-worm scan traffic. As such, the SFM can be used to sensitively detect the C-Worm scan traffic.

Fig. 5. PDF of SFM on C-Worm traffic (x-axis: SFM value; y-axis: probability density)

Fig. 6. PDF of SFM on normal non-worm traffic (x-axis: SFM value; y-axis: probability density)

The large SFM values of normal non-worm scan traffic can be explained as follows. The normal non-worm scan traffic does not tend to concentrate at any particular frequency, since its random dynamics are not caused by any recurring phenomenon. The small SFM value of the C-Worm can be explained by the fact that the power of the C-Worm scan traffic lies within a narrow-band frequency range. Such concentration within a narrow range of frequencies is unavoidable, since the C-Worm adapts to the dynamics of the Internet in a recurring manner in order to manipulate the overall scan traffic volume. In reality, these recurring manipulations involve a steady increase followed by a decrease in the scan traffic volume.

Footnote 6. The traces used in this paper contain log files which have over 100 million records, and the total size exceeds 40 GB.

Notice that frequency domain analysis requires more samples than time domain analysis, since a frequency domain technique such as the Fourier transform needs to derive the power spectrum amplitude for different frequencies. In order to generate an accurate spectrum amplitude for relatively high frequencies, a high granularity of data sampling is required. In our case, we rely on Internet threat monitoring (ITM) systems to collect traffic traces from monitors (motion sensors) in a timely manner. As a matter of fact, other existing detection schemes based on the scan traffic rate, variance, or trend also demand a high sampling frequency from ITM systems in order to accurately detect worm attacks. Enabling the ITM system with timely data collection will benefit real-time worm detection.

4.2 Spectrum-based Detection Scheme

We now present the details of our spectrum-based detection scheme. Similar to other detection schemes, we use a "destination count", defined as the number of unique destination IP addresses targeted by launched scans during worm propagation. To understand how the destination count data is obtained, we recall that an ITM system collects logs from distributed monitors across the Internet. On a side note, ITM systems are a widely deployed facility to detect, analyze, and characterize dangerous Internet threats such as worms. In general, an ITM system consists of one centralized data center and a number of monitors distributed across the Internet. Each monitor records traffic addressed to a range of IP addresses (which are not commonly used, also called dark IP addresses) and periodically sends the traffic logs to the data center. The data center then analyzes the collected traffic logs and publishes reports (e.g., statistics of monitored traffic) to ITM system users. Therefore
the baseline traffic in our study is scan traffic. With reports in a sampling window $W_s$, the source count $X(t)$ is obtained by counting the unique source IP addresses in the received logs.

To conduct spectrum analysis, we consider a detection sliding window $W_d$ in the worm detection system. $W_d$ consists of $q$ ($> 1$) continuous detection sampling windows, and each sampling window lasts $W_s$. The detection sampling window is the unit time interval used to sample the detection data (e.g., the destination count). Hence, at time $i$, within a sliding window $W_d$, there are $q$ samples denoted by $(X(i - q - 1), X(i - q - 2), \ldots, X(i))$, where $X(i - j - 1)$ ($j \in (1, q)$) is the $j$-th destination count from time $i - j - 1$ to $i - j$.

In our spectrum-based detection scheme, the distribution of the PSD and its corresponding SFM are used to distinguish the C-Worm scan traffic from the non-worm scan traffic. Recall that the definitions of the PSD distribution and its corresponding SFM are introduced in Section 4.1. In our worm detection scheme, the detection data (e.g., the destination count) is further processed in order to obtain its PSD and SFM. In the following, we detail how the PSD and SFM are determined during the processing of the detection data.

4.2.1 Power Spectral Density (PSD)

To obtain the PSD distribution for worm detection data, we need to transform the data from the time domain into the frequency domain. To do so, we use a random process $X(t)$, $t \in [0, n]$, to model the worm detection data. Assuming $X(t)$ is the source count in time period $[t - 1, t]$ ($t \in [1, n]$), we define the auto-correlation of $X(t)$ by

\[ R_X(L) = E[X(t)\,X(t + L)]. \tag{5} \]

In Formula (5), $R_X(L)$ is the correlation of worm detection data in an interval $L$. If a recurring behavior exists, a Fourier transform of the auto-correlation function $R_X(L)$ can reveal such behavior. Thus, the PSD function of the scan traffic data (also represented by $S_X(f)$, where $f$ refers to frequency) is determined using the Discrete Fourier Transform (DFT) of its auto-correlation function as follows,

\[ \psi(R_X[L], K) = \sum_{L=0}^{N-1} R_X[L] \cdot e^{-j 2\pi K L / N}, \tag{6} \]

where $K = 0, 1, \ldots, N - 1$.

As the PSD inherently captures any recurring pattern in the frequency domain, the PSD function shows a comparatively even distribution across a wide spectrum range for the normal non-worm scan traffic. The PSD of C-Worm scan traffic shows spikes or noticeably higher concentrations at a certain range of the spectrum.

4.2.2 Spectral Flatness Measure (SFM)

We measure the flatness of the PSD to distinguish the scan traffic of the C-Worm from the normal non-worm scan traffic. For this, we introduce the Spectral Flatness Measure (SFM), which can capture anomalous behavior in a certain range of frequencies. The SFM is defined as the ratio of the geometric mean to the arithmetic mean of the PSD coefficients. It can be expressed as,

\[ SFM = \frac{\left[\prod_{k=1}^{n} S(f_k)\right]^{1/n}}{\frac{1}{n}\sum_{k=1}^{n} S(f_k)}, \tag{7} \]

where $S(f_k)$ is a PSD coefficient of the PSD obtained from the results in Formula (6). The SFM is a widely used measure for discriminating frequencies in various applications, such as voiced frame detection in speech recognition. In general, small values of SFM imply a concentration of data in narrow frequency spectrum ranges. Note that the C-Worm has unpreventable recurring behavior in its scan traffic; consequently, its SFM values are comparatively smaller than the SFM values of normal non-worm scan traffic. To make the SFM useful in detecting C-Worms, we introduce a sliding window to capture noticeably higher concentrations in a small range of the spectrum. When such a noticeable concentration is recognized, we derive the SFM within a wider frequency range. From Fig. 5, we can observe that the SFM value for the C-Worm is very small (e.g., with a mean value of approximately 0.075). A formal analysis of the SFM for the C-Worm is presented in Appendix B.

4.2.3 Detection Decision Rule

We now describe the method of applying an appropriate detection rule to detect C-Worm propagation. As the SFM value can be used to sensitively distinguish the C-Worm from normal non-worm scan traffic, worm detection is performed by comparing the SFM with a predefined threshold $T_r$. If the SFM value is smaller than $T_r$, a C-Worm propagation alert is generated. The value of the threshold $T_r$ can be set based on knowledge of the statistical distribution (e.g., the PDF) of the SFM values that correspond to the non-worm scan traffic. Notice that the $T_r$ value for the non-worm traffic can be derived by analyzing the historical data provided by the SANS Internet Storm Center (ISC). In worm detection systems, monitors collect port-scan traffic to a certain area of dark IP addresses and periodically report scan traffic logs to the data center. The data center then aggregates the data from different monitors on the same port and publishes the data. Based on the historical data for different ports, we can build statistical profiles of port-scan traffic on different ports and then derive the $T_r$ value for the non-worm traffic. Based on the continuously reported data, the value of $T_r$ is tuned and adaptively used to carry out worm detection.

If we can obtain the PDF of SFM values for the C-Worm through comprehensive simulations, and even real-world profiled data in the future, the optimal threshold can be obtained by applying Bayes classification. If the PDF of SFM values for the C-Worm is not available, we can set an appropriate $T_r$ value based on the PDF of SFM values of the normal non-worm scan traffic. For example, the $T_r$ value can be determined by the Chebyshev inequality in order to obtain a reasonable false positive rate for worm detection. Hence, in Section 5, we evaluate our spectrum-based detection scheme against the C-Worm in two cases: (a) the PDF of SFM values is known for both the normal non-worm scan traffic
and the C-Worm scan traffic, and (b) the PDF of SFM values is only known for the normal non-worm scan traffic.

In addition, our spectrum-based scheme is also generic for detecting PRS worms. This is due to the fact that the propagation traffic of PRS worms has an exponentially increasing pattern. Thus, in the propagation traffic of PRS worms, the PSD values in the low frequency range are much higher compared with other frequency ranges. A formal analysis of the SFM for the PRS worm is presented in Appendix C.

Notice that even if the C-Worm monitors the port-scan traffic report, it will be hard for the C-Worm to make its SFM similar to that of the background traffic. This can be explained by two factors. First, the low value of SFM is mainly caused by the closed-loop control nature of the C-Worm. The concentration within a narrow range of frequencies is unavoidable, since the C-Worm adapts to the dynamics of the Internet in a recurring manner in order to manipulate the overall scan traffic volume. Based on our analysis, the non-worm traffic on a port is rather random and its PSD has a flat pattern. That is, the non-worm traffic on the port distributes similar power across different frequencies. Second, as discussed earlier, without introducing the closed-loop control, it will be difficult for the attacker to hide the irregularity of worm propagation traffic in the time domain. When worm attacks incorporate the closed-loop control mechanism to camouflage their traffic, they expose a relatively small SFM value. Hence, by integrating our spectrum-based detection with existing traffic rate-based anomaly detection in the time domain, we can force the worm attacker into a dilemma: if the attacker does not use the closed-loop control, the existing traffic rate-based detection schemes will be able to detect the worm; if the attacker adopts the closed-loop control, it will cause a relatively small SFM. The worm attack can therefore be detected by our spectrum-based scheme working along with other existing traffic rate-based detection schemes.

5 PERFORMANCE EVALUATION

In this section, we report our evaluation results, which illustrate the effectiveness of our spectrum-based detection scheme against both the C-Worm and the PRS worm in comparison with existing representative detection schemes for detecting wide-spreading worms. In addition, we also take into consideration destination distribution based detection schemes and evaluate their performance against the C-Worm.

5.1 Evaluation Methodology

5.1.1 Evaluation Metrics

In order to evaluate the performance of any given detection scheme against the C-Worm, we use the following three metrics listed in Table II. The first metric is the worm Infection Ratio ($IR$), which is defined as the ratio of the number of infected computers to the total number of vulnerable computers, assuming there is no worm detection/defense system in place. The other two metrics are the Detection Time ($DT$) and the Maximal Infection Ratio ($MIR$). $DT$ is defined as the time taken to successfully detect a wide-spreading worm from the moment the worm propagation starts. It quantifies the detection speed of a detection scheme. $MIR$ is defined as the ratio of the number of infected computers to the total number of vulnerable computers up to the moment when the worm spreading is detected. It quantifies the damage caused by a worm before being detected. The objective of any detection scheme is to minimize the damage caused by rapid worm propagation. Hence, $MIR$ and $DT$ can be used to quantify the effectiveness of any worm detection scheme. The higher these values, the more effective the worm attack and the less effective the detection. In addition, we use two more metrics: the Detection Rate ($P_D$) and the False Positive Rate ($P_F$). $P_D$ is defined as the probability that a detection scheme correctly identifies a worm attack. $P_F$ is defined as the probability that a detection scheme mistakenly identifies a non-existent worm attack.

5.1.2 Simulation Setup

In our evaluation, we considered experiments with both real-world "non-worm" traffic and simulated C-Worm traffic. To make our experiments reflect real-world practice, some key parameters that we used to generate C-Worm traffic in our simulation were based on previous results from a real worm incident, the "Code-Red" worm in 2001. Specifically, we set the total number of vulnerable computers on the Internet to 360,000, which is the maximum number of computers that could be infected by the "Code-Red" worm. Additionally, we set the scan rate $S$ (number of scans per minute) to be variable within a range, which allows us to emulate infected computers in different network environments. In our evaluation, the scan rates are predetermined and follow a Gaussian distribution $S = N(S_m^2, S_\sigma^2)$, where $S_m$ and $S_\sigma$ are in [20, 70], similar to those used in prior work. We merged the simulated C-Worm attack traffic into replayed "non-worm" traffic traces and carried out the evaluation study.

We simulate the C-Worm attacks by varying the attack parameters, such as the attack probability $P(t)$ and the number of worm instances participating in the scan, $\bar{M}_C$, defined in Section 3. $\bar{M}_C$ follows the Gaussian distribution $N(m, \sigma)$ and is changed dynamically by the C-Worm during its propagation. In particular, for $N(m, \sigma)$, $m$ is randomly selected in (12000, 75000) and $\sigma$ is randomly selected in (0.2, 100). We simulate different C-Worm attacks by varying the values of $m$ and $\sigma$. The detection sampling window $W_s$ is set to 5 minutes, and the detection sliding window $W_d$ is set to be incremental from 80 min to 800 min. The incremental selection of $W_d$ from a comparatively small window to a large window can adaptively reflect the worm scan traffic dynamics caused by C-Worm propagation at various speeds. We choose the detection sampling window to be short enough to provide sufficient sampling accuracy, as prescribed by Nyquist's sampling theory. Also, we choose the detection sliding window to be long enough to capture adequate information for spectrum-based analysis.

In practice, since detection systems analyze port-scan traffic blended with the non-worm scan traffic, we replay the real-world traces as non-worm scan traffic (background noise to the attack traffic) in our simulations. In particular, we used the ISC real-world trace (Shield logs dataset) from 01/01/2005