Detecting Spam Zombies by Monitoring Outgoing Messages

198 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 2, MARCH/APRIL 2012

Detecting Spam Zombies by Monitoring Outgoing Messages

Zhenhai Duan, Senior Member, IEEE, Peng Chen, Fernando Sanchez, Yingfei Dong, Member, IEEE, Mary Stephenson, and James Michael Barker

Abstract—Compromised machines are one of the key security threats on the Internet; they are often used to launch various security attacks such as spamming, spreading malware, DDoS, and identity theft. Given that spamming provides a key economic incentive for attackers to recruit a large number of compromised machines, we focus on detecting the compromised machines in a network that are involved in spamming activities, commonly known as spam zombies. We develop an effective spam zombie detection system named SPOT by monitoring outgoing messages of a network. SPOT is designed based on a powerful statistical tool called the Sequential Probability Ratio Test, which has bounded false positive and false negative error rates. We evaluate the performance of the developed SPOT system using a two-month e-mail trace collected in a large US campus network. Our evaluation studies show that SPOT is an effective and efficient system for automatically detecting compromised machines in a network. For example, among the 440 internal IP addresses observed in the e-mail trace, SPOT identifies 132 of them as being associated with compromised machines. Of the 132 IP addresses identified by SPOT, 126 can be either independently confirmed (110) or are highly likely (16) to be compromised. Moreover, only seven internal IP addresses associated with compromised machines in the trace are missed by SPOT. In addition, we compare the performance of SPOT with two other spam zombie detection algorithms, based on the number and the percentage of spam messages originated or forwarded by internal machines, respectively, and show that SPOT outperforms both.
Index Terms—Compromised machines, spam zombies, compromised machine detection algorithms.

1 INTRODUCTION

A major security challenge on the Internet is the existence of a large number of compromised machines. Such machines have been increasingly used to launch various security attacks including spamming, spreading malware, DDoS, and identity theft [1], [10], [14]. Two natures of the compromised machines on the Internet, their sheer volume and wide spread, render many existing security countermeasures less effective and make defending against attacks involving compromised machines extremely hard. At the same time, identifying and cleaning compromised machines in a network remains a significant challenge for system administrators of networks of all sizes.

In this paper, we focus on the detection of the compromised machines in a network that are used for sending spam messages, commonly referred to as spam zombies. Given that spamming provides a critical economic incentive for the controllers of the compromised machines to recruit these machines, it has been widely observed that many compromised machines are involved in spamming [17], [19], [25]. A number of recent research efforts have studied the aggregate global characteristics of spamming botnets (networks of compromised machines involved in spamming), such as the size of botnets and the spamming patterns of botnets, based on sampled spam messages received at a large e-mail service provider [25], [26].

Rather than studying the aggregate global characteristics of spamming botnets, we aim to develop a tool for system administrators to automatically detect the compromised machines in their networks in an online manner. We consider ourselves situated in a network and ask the following question: How can we automatically identify the compromised machines in the network as outgoing messages pass the monitoring point sequentially? The approaches developed in the previous work [25], [26] cannot be applied here. The locally generated outgoing messages in a network normally cannot provide the aggregate large-scale spam view required by these approaches. Moreover, these approaches cannot support the online detection requirement in the environment we consider.

The nature of sequentially observing outgoing messages gives rise to the sequential detection problem. In this paper, we develop a spam zombie detection system, named SPOT, by monitoring outgoing messages. SPOT is designed based on a statistical method called the Sequential Probability Ratio Test (SPRT), developed by Wald in his seminal work [21]. SPRT is a powerful statistical method that can be used to test between two hypotheses (in our case, a machine is compromised versus the machine is not compromised) as the events (in our case, outgoing messages) occur sequentially.

. Z. Duan and F. Sanchez are with the Department of Computer Science, Florida State University, Tallahassee, FL 32306-4530. E-mail: {duan, sanchez}
. P. Chen is with Juniper Networks, Sunnyvale, CA 94089.
. Y. Dong is with the Department of Electrical Engineering, University of Hawaii, 2540 Dole Street, Holmes Hall 483, Honolulu, HI 96822.
. M. Stephenson is with Information Technology Services, Florida State University, 2035 E. Paul Dirac Drive, 200 BFS, Tallahassee, FL 32310.
. J.M. Barker is with Information Technology Services, The University of North Carolina at Chapel Hill, 211 Manning Drive, Chapel Hill, NC 27599-3420.

Manuscript received 27 Aug. 2010; revised 9 May 2011; accepted 9 June 2011; published online 30 Sept. 2011. For information on obtaining reprints of this article, please send e-mail to:, and reference IEEECS Log Number TDSC-2010-08-0149. Digital Object Identifier no. 10.1109/TDSC.2011.49.

1545-5971/12/$31.00 © 2012 IEEE  Published by the IEEE Computer Society
As a simple and powerful statistical method, SPRT has a number of desirable features. It minimizes the expected number of observations required to reach a decision among all the sequential and nonsequential statistical tests with no greater error rates. This means that the SPOT detection system can identify a compromised machine quickly. Moreover, both the false positive and false negative probabilities of SPRT can be bounded by user-defined thresholds. Consequently, users of the SPOT system can select the desired thresholds to control the false positive and false negative rates of the system.

In this paper, we develop the SPOT detection system to assist system administrators in automatically identifying the compromised machines in their networks. We also evaluate the performance of the SPOT system based on a two-month e-mail trace collected in a large US campus network. Our evaluation studies show that SPOT is an effective and efficient system for automatically detecting compromised machines in a network. For example, among the 440 internal IP addresses observed in the e-mail trace, SPOT identifies 132 of them as being associated with compromised machines. Of the 132 IP addresses identified by SPOT, 126 can be either independently confirmed (110) or are highly likely (16) to be compromised. Moreover, only seven internal IP addresses associated with compromised machines in the trace are missed by SPOT. In addition, SPOT only needs a small number of observations to detect a compromised machine. The majority of spam zombies are detected with as few as three spam messages. For comparison, we also design and study two other spam zombie detection algorithms, based on the number of spam messages and the percentage of spam messages originated or forwarded by internal machines, respectively. We compare the performance of SPOT with the two other detection algorithms to illustrate the advantages of the SPOT system.

The remainder of the paper is organized as follows: In Section 2, we discuss related work in the area of botnet detection (focusing on spam zombie detection schemes). We formulate the spam zombie detection problem in Section 3. Section 4 provides the necessary background on SPRT for developing the SPOT spam zombie detection system. In Section 5, we provide the detailed design of SPOT and the two other detection algorithms. Section 6 evaluates the SPOT detection system based on the two-month e-mail trace and contrasts its performance with the two other detection algorithms. We briefly discuss practical deployment issues and potential evasion techniques in Section 7, and conclude the paper in Section 8.

2 RELATED WORK

In this section, we discuss related work in detecting compromised machines. We first focus on the studies that utilize spamming activities to detect bots and then briefly discuss a number of efforts in detecting general botnets.

Based on e-mail messages received at a large e-mail service provider, two recent studies [25], [26] investigated the aggregate global characteristics of spamming botnets, including the size of botnets and the spamming patterns of botnets. These studies provided important insights into the aggregate global characteristics of spamming botnets by clustering spam messages received at the provider into spam campaigns, using embedded URLs and near-duplicate content clustering, respectively. However, their approaches are better suited for large e-mail service providers trying to understand the aggregate global characteristics of spamming botnets than for deployment by individual networks to detect internal compromised machines. Moreover, their approaches cannot support the online detection requirement in the network environment considered in this paper. We aim to develop a tool to assist system administrators in automatically detecting compromised machines in their networks in an online manner.

Xie et al. developed an effective tool named DBSpam to detect proxy-based spamming activities in a network, relying on the packet symmetry property of such activities [23]. We intend to identify all types of compromised machines involved in spamming, not only the spam proxies that translate and forward upstream non-SMTP packets (for example, HTTP) into SMTP commands to downstream mail servers as in [23].

In the following, we discuss a few schemes on detecting general botnets. BotHunter [8], developed by Gu et al., detects compromised machines by correlating the IDS dialog trace in a network. It was developed based on the observation that a complete malware infection process has a number of well-defined stages, including inbound scanning, exploit usage, egg downloading, outbound bot coordination dialog, and outbound attack propagation. By correlating inbound intrusion alarms with outbound communication patterns, BotHunter can detect potentially infected machines in a network. Unlike BotHunter, which relies on the specifics of the malware infection process, SPOT focuses on the economic incentive behind many compromised machines and their involvement in spamming.

An anomaly-based detection system named BotSniffer [9] identifies botnets by exploring the spatial-temporal behavioral similarity commonly observed in botnets. It focuses on IRC-based and HTTP-based botnets. In BotSniffer, flows are classified into groups based on the common server that they connect to. If the flows within a group exhibit behavioral similarity, the corresponding hosts involved are detected as being compromised. BotMiner [7] is one of the first botnet detection systems that are both protocol and structure independent. In BotMiner, flows are classified into groups based on similar communication patterns and similar malicious activity patterns, respectively. The intersection of the two groups is considered to be compromised machines. Compared to general botnet detection systems such as BotHunter, BotSniffer, and BotMiner, SPOT is a lightweight compromised machine detection scheme that exploits the economic incentive for attackers to recruit a large number of compromised machines.

As a simple and powerful statistical method, the Sequential Probability Ratio Test has been successfully applied in many areas [22]. In the area of network security, SPRT has been used to detect portscan activities [12], proxy-based spamming activities [23], anomaly-based botnets [9], and MAC protocol misbehavior in wireless networks [16].
3 PROBLEM FORMULATION AND ASSUMPTIONS

In this section, we formulate the spam zombie detection problem in a network. In particular, we discuss the network model and the assumptions we make in the detection problem.

Fig. 1 (network model) illustrates the logical view of the network. We assume that messages originated from machines inside the network will pass the deployed spam zombie detection system. This assumption can be achieved in a few different scenarios. For example, the outgoing e-mail traffic (with destination port number 25) can be replicated and redirected to the spam zombie detection system.

A machine in the network is assumed to be either compromised or normal (that is, not compromised). In this paper, we only focus on the compromised machines that are involved in spamming. Therefore, we use the term compromised machine to denote a spam zombie, and use the two terms interchangeably. Let Xi for i = 1, 2, ... denote the successive observations of a random variable X corresponding to the sequence of messages originated from machine m inside the network. We let Xi = 1 if message i from the machine is a spam, and Xi = 0 otherwise. The detection system assumes that the behavior of a compromised machine is different from that of a normal machine in terms of the messages they send. Specifically, a compromised machine will generate a spam message with a higher probability than a normal machine. Formally,

  Pr(Xi = 1 | H1) > Pr(Xi = 1 | H0),    (1)

where H1 denotes that machine m is compromised and H0 that the machine is normal. The spam zombie detection problem can be formally stated as follows: as the Xi arrive sequentially at the detection system, the system determines with a high probability whether machine m has been compromised. Once a decision is reached, the detection system reports the result, and further actions can be taken, e.g., to clean the machine.

We assume that a (content-based) spam filter is deployed at the detection system so that an outgoing message can be classified as either spam or nonspam [20]. No existing spam filter can achieve perfect spam detection accuracy; they all suffer from both false positive and false negative errors. The false negative rate of a spam filter measures the percentage of spam messages that are misclassified, and the false positive rate measures the percentage of nonspam messages that are misclassified. We note that all widely deployed spam filters have very low false negative and false positive rates, and such small spam classification errors will only have a marginal impact on the performance of the spam zombie detection algorithms (see Section 6).

We also assume that a sending machine m, as observed by the spam zombie detection system, is an end-user client machine, not a mail relay server. This assumption is just for the convenience of our exposition. The proposed SPOT system can handle the case where an outgoing message is forwarded by a few internal mail relay servers before leaving the network. We discuss practical deployment issues in Section 7.

In addition, we assume that an IP address corresponds to a unique machine and ignore the potential impacts of dynamic IP addresses on the detection algorithms in the presentation of the algorithms [3], [24]. We investigate the potential impacts of dynamic IP addresses on the detection algorithms in Sections 5 and 6.

4 BACKGROUND ON SEQUENTIAL PROBABILITY RATIO TEST

In this section, we provide the necessary background on the Sequential Probability Ratio Test for understanding the proposed spam zombie detection system. Interested readers are directed to [21] for a detailed discussion of the topic of SPRT.

In its simplest form, SPRT is a statistical method for testing a simple null hypothesis against a single alternative hypothesis. Intuitively, SPRT can be considered as a one-dimensional random walk with two user-specified boundaries corresponding to the two hypotheses. As the samples of the concerned random variable arrive sequentially, the walk moves either upward or downward one step, depending on the value of the observed sample. When the walk hits or crosses either of the boundaries for the first time, the walk terminates and the corresponding hypothesis is selected.

As a simple and powerful statistical tool, SPRT has a number of compelling and desirable features that lead to the widespread application of the technique in many areas [22]. First, both the actual false positive and false negative probabilities of SPRT can be bounded by the user-specified error rates. A smaller error rate tends to require a larger number of observations before SPRT terminates. Thus, users can balance the performance (in terms of false positive and false negative rates) and the cost (in terms of the number of required observations) of an SPRT test. Second, it has been proved that SPRT minimizes the average number of required observations for reaching a decision for a given error rate, among all sequential and nonsequential statistical tests.

In the following, we present the formal definition and a number of important properties of SPRT. The detailed derivations of the properties can be found in [21]. Let X denote a Bernoulli random variable under consideration with an unknown parameter θ, and X1, X2, ... the successive observations of X. As discussed above, SPRT is used for testing a simple hypothesis H0 that θ = θ0 against a single alternative H1 that θ = θ1. That is,

  Pr(Xi = 1 | H0) = 1 − Pr(Xi = 0 | H0) = θ0,
  Pr(Xi = 1 | H1) = 1 − Pr(Xi = 0 | H1) = θ1.
To ease exposition and practical computation, we compute the logarithm of the probability ratio instead of the probability ratio in the description of SPRT. For any positive integer n = 1, 2, ..., define

  Λn = ln [ Pr(X1, X2, ..., Xn | H1) / Pr(X1, X2, ..., Xn | H0) ].    (2)

Assuming that the Xi are independent (and identically distributed), we have

  Λn = ln [ Πi Pr(Xi | H1) / Πi Pr(Xi | H0) ] = Σi=1..n ln [ Pr(Xi | H1) / Pr(Xi | H0) ] = Σi=1..n Zi,    (3)

where Zi = ln [ Pr(Xi | H1) / Pr(Xi | H0) ], which can be considered as the step of the random walk represented by Λ. When the observation is 1 (Xi = 1), the constant ln(θ1/θ0) is added to the preceding value of Λ. When the observation is 0 (Xi = 0), the constant ln((1 − θ1)/(1 − θ0)) is added.

The Sequential Probability Ratio Test for testing H0 against H1 is then defined as follows: given two user-specified constants A and B where A < B, at each stage n of the Bernoulli experiment, the value of Λn is computed as in (3); then

  Λn ≤ A  =>  accept H0 and terminate test,
  Λn ≥ B  =>  accept H1 and terminate test,    (4)
  A < Λn < B  =>  take an additional observation and continue the experiment.

In the following, we describe a number of important properties of SPRT. If we consider H1 as a detection and H0 as a normality, an SPRT process may result in two types of errors: a false positive, where H0 is true but SPRT accepts H1, and a false negative, where H1 is true but SPRT accepts H0. We let α and β denote the user-desired false positive and false negative probabilities, respectively. There exist some fundamental relations among α, β, A, and B [21]:

  A ≥ ln [ β / (1 − α) ],   B ≤ ln [ (1 − β) / α ].

For most practical purposes, we can take the equality, that is,

  A = ln [ β / (1 − α) ],   B = ln [ (1 − β) / α ].    (5)

This will only slightly affect the actual error rates. Formally, let α′ and β′ represent the actual false positive rate and the actual false negative rate, respectively, and let A and B be computed using (5); then the following relations hold:

  α′ ≤ α / (1 − β),   β′ ≤ β / (1 − α),    (6)

and

  α′ + β′ ≤ α + β.    (7)

Equations (6) and (7) provide important bounds for α′ and β′. In all practical applications, the desired false positive and false negative rates will be small, for example, in the range from 0.01 to 0.05. In these cases, α/(1 − β) and β/(1 − α) very closely equal the desired α and β, respectively. In addition, (7) specifies that the actual false positive rate and the actual false negative rate cannot both be larger than the corresponding desired error rates in a given experiment. Therefore, in all practical applications, we can compute the boundaries A and B using (5), given the user-specified false positive and false negative rates. This will provide at least the same protection against errors as if we used the precise values of A and B for a given pair of desired error rates. (The precise values of A and B are hard to obtain.)

Another important property of SPRT is the number of observations, N, required before SPRT reaches a decision. The following two equations approximate the average number of observations required when H1 and H0 are true, respectively:

  E[N | H1] = [ β ln(β/(1 − α)) + (1 − β) ln((1 − β)/α) ] / [ θ1 ln(θ1/θ0) + (1 − θ1) ln((1 − θ1)/(1 − θ0)) ],    (8)

  E[N | H0] = [ (1 − α) ln(β/(1 − α)) + α ln((1 − β)/α) ] / [ θ0 ln(θ1/θ0) + (1 − θ0) ln((1 − θ1)/(1 − θ0)) ].    (9)

From the above equations, we can see that the average number of required observations when H1 or H0 is true depends on four parameters: the desired false positive and false negative rates (α and β), and the distribution parameters θ1 and θ0 for hypotheses H1 and H0, respectively. We note that SPRT does not require precise knowledge of the distribution parameters θ1 and θ0. As long as the true distribution of the underlying random variable is sufficiently close to one of the hypotheses compared to the other (that is, θ is closer to either θ1 or θ0), SPRT will terminate with the bounded error rates. Imprecise knowledge of θ1 and θ0 will only affect the number of observations required for SPRT to reach a decision.

5 SPAM ZOMBIE DETECTION ALGORITHMS

In this section, we develop three spam zombie detection algorithms. The first one is SPOT, which utilizes the Sequential Probability Ratio Test presented in the last section. We discuss the impacts of the SPRT parameters on SPOT in the context of spam zombie detection. The other two spam zombie detection algorithms are developed based on the number of spam messages and the percentage of spam messages sent from an internal machine, respectively.

5.1 SPOT Detection Algorithm

SPOT is designed based on the statistical tool SPRT discussed in the last section. In the context of detecting spam zombies in SPOT, we consider H1 as a detection and H0 as a normality. That is, H1 is true if the concerned machine is compromised, and H0 is true if it is not compromised. In addition, we let Xi = 1 if the ith message from the concerned machine in the network is a spam, and Xi = 0 otherwise. Recall that SPRT requires four configurable parameters from users, namely, the desired false positive probability α, the desired false negative probability β, the probability that a message is a spam when H1 is true (θ1), and the probability that a message is a spam when H0 is true (θ0). We discuss how users configure the values of the four parameters after we present the SPOT algorithm.
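The test defined by (3)-(5), together with the expected-sample-size approximation (8), can be sketched in a few lines of Python. The parameter values below (α = β = 0.01, θ1 = 0.9, θ0 = 0.2) are the illustrative settings discussed around Fig. 2, not prescribed values, and the function names are hypothetical.

```python
import math

def sprt(observations, theta0, theta1, alpha, beta):
    """Run the SPRT of (3)-(5) over a sequence of 0/1 observations.

    Returns ("H0" or "H1", number of observations consumed), or
    ("undecided", n) if the walk never crosses a boundary.
    """
    # Boundaries from (5): A = ln(beta/(1-alpha)), B = ln((1-beta)/alpha).
    A = math.log(beta / (1 - alpha))
    B = math.log((1 - beta) / alpha)
    # Per-observation steps Z_i from (3).
    step1 = math.log(theta1 / theta0)                  # added when X_i = 1
    step0 = math.log((1 - theta1) / (1 - theta0))      # added when X_i = 0
    llr = 0.0
    for n, x in enumerate(observations, start=1):
        llr += step1 if x == 1 else step0
        if llr >= B:
            return "H1", n
        if llr <= A:
            return "H0", n
    return "undecided", len(observations)

def expected_n_h1(theta0, theta1, alpha, beta):
    """Approximate E[N | H1] from (8)."""
    num = beta * math.log(beta / (1 - alpha)) + (1 - beta) * math.log((1 - beta) / alpha)
    den = theta1 * math.log(theta1 / theta0) + (1 - theta1) * math.log((1 - theta1) / (1 - theta0))
    return num / den

# An all-spam stream crosses the upper boundary within a few observations.
decision, n = sprt([1, 1, 1, 1, 1], theta0=0.2, theta1=0.9, alpha=0.01, beta=0.01)
print(decision, n)
print(round(expected_n_h1(0.2, 0.9, 0.01, 0.01), 2))
```

With these settings, E[N | H1] evaluates to roughly four observations, consistent with the small detection cost discussed below.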
Fig. 2. Average number of required observations when H1 is true (β = 0.01).

Based on the user-specified values of α and β, the values of the two boundaries A and B of SPRT are computed using (5).

In the following, we describe the SPOT detection algorithm. Algorithm 1 outlines the steps of the algorithm. When an outgoing message arrives at the SPOT detection system, the sending machine's IP address is recorded, and the message is classified as either spam or nonspam by the (content-based) spam filter. For each observed IP address, SPOT maintains the logarithm value of the corresponding probability ratio Λn, whose value is updated according to (3) as message n arrives from the IP address (lines 6 to 12 in Algorithm 1). Based on the relation between Λn and A and B, the algorithm determines whether the corresponding machine is compromised, normal, or a decision cannot be reached and additional observations are needed (lines 13 to 21).

Algorithm 1. SPOT spam zombie detection system
 1: An outgoing message arrives at SPOT
 2: Get IP address of sending machine m
 3: // all following parameters specific to machine m
 4: Let n be the message index
 5: Let Xn = 1 if message is spam, Xn = 0 otherwise
 6: if (Xn == 1) then
 7:   // spam
 8:   Λn += ln(θ1/θ0)
 9: else
10:   // nonspam
11:   Λn += ln((1 − θ1)/(1 − θ0))
12: end if
13: if (Λn ≥ B) then
14:   Machine m is compromised. Test terminates for m.
15: else if (Λn ≤ A) then
16:   Machine m is normal. Test is reset for m.
17:   Λn = 0
18:   Test continues with new observations
19: else
20:   Test continues with an additional observation
21: end if

We note that in the context of spam zombie detection, from the viewpoint of network monitoring, it is more important to identify the machines that have been compromised than the machines that are normal. After a machine is identified as being compromised (lines 13 and 14), it is added to the list of potentially compromised machines that system administrators can go after to clean. The message-sending behavior of the machine is also recorded should further analysis be required. Until the machine is cleaned and removed from the list, the SPOT detection system does not need to further monitor the message-sending behavior of the machine.

On the other hand, a machine that is currently normal may get compromised at a later time. Therefore, we need to continuously monitor machines that are determined to be normal by SPOT. Once such a machine is identified by SPOT, the records of the machine in SPOT are reset; in particular, the value of Λn is set to zero, so that a new monitoring phase starts for the machine (lines 15 to 18).

5.2 Parameters of SPOT Algorithm

SPOT requires four user-defined parameters: α, β, θ1, and θ0. In the following, we discuss how a user of the SPOT algorithm configures these parameters, and how these parameters may affect the performance of SPOT. As discussed in the previous section, α and β are the desired false positive and false negative rates. They are normally small values in the range from 0.01 to 0.05, which users of SPOT can easily specify independent of the behaviors of the compromised and normal machines in the network. As we have shown in Section 4, the values of α and β will affect the cost of the SPOT algorithm, that is, the number of observations needed for the algorithm to reach a conclusion. In general, smaller values of α and β will require a larger number of observations for SPOT to reach a detection.

Ideally, θ1 and θ0 should indicate the true probability of a message being spam from a compromised machine and a normal machine, respectively, which are hard to obtain. A practical way to assign values to θ1 and θ0 is to use the detection rate and the false positive rate of the spam filter deployed together with the spam zombie detection system, respectively. Given that all the widely used spam filters have a high detection rate and a low false positive rate [20], values of θ1 and θ0 assigned in this way should be very close to the true probabilities.

To get some intuitive understanding of the average number of observations required for SPRT to reach a decision, Figs. 2a and 2b show the value of E[N | H1] as a function of θ0 and θ1, respectively, for different desired false positive rates. In the figures, we set the false negative rate to
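Algorithm 1 can be read as a small per-IP state machine. The sketch below is an illustrative rendering of that pseudocode, assuming the spam-filter verdict accompanies each message; the default parameter values stand in for a deployed filter's detection and false positive rates and are not prescribed by the text.

```python
import math

class SPOT:
    """Per-machine SPRT state following the structure of Algorithm 1."""

    def __init__(self, alpha=0.01, beta=0.01, theta1=0.9, theta0=0.2):
        self.A = math.log(beta / (1 - alpha))       # accept-H0 boundary, from (5)
        self.B = math.log((1 - beta) / alpha)       # accept-H1 boundary, from (5)
        self.step1 = math.log(theta1 / theta0)                 # X_n = 1 (spam)
        self.step0 = math.log((1 - theta1) / (1 - theta0))     # X_n = 0 (nonspam)
        self.llr = {}              # machine IP -> current log probability ratio
        self.compromised = set()   # machines for which the test has terminated

    def observe(self, ip, is_spam):
        """Process one outgoing message from `ip`; return the current verdict."""
        if ip in self.compromised:
            return "compromised"            # test already terminated for this machine
        llr = self.llr.get(ip, 0.0) + (self.step1 if is_spam else self.step0)
        if llr >= self.B:                   # lines 13-14: detection
            self.compromised.add(ip)
            self.llr.pop(ip, None)
            return "compromised"
        if llr <= self.A:                   # lines 15-18: normal; reset, keep watching
            self.llr[ip] = 0.0
            return "normal"
        self.llr[ip] = llr                  # lines 19-20: need more observations
        return "pending"

spot = SPOT()
for _ in range(4):
    verdict = spot.observe("10.0.0.5", is_spam=True)
print(verdict)
```

Note that a "normal" verdict only resets the walk (line 17 of Algorithm 1); monitoring continues, whereas a "compromised" verdict terminates the test for that machine.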
  35. 35. ¼ 0:01. In Fig. 2a, we assume the probability of a each time window: one is the percentage of spam messagesmessage being spam when H1 is true to be 0.9 (1 ¼ 0:9). sent from a machine, another the total number of messages.From the figure, we can see that it only takes a small Let N and n denote the total messages and spam messagesnumber of observations for SPRT to reach a decision. For originated from a machine m within a time window,example, when 0 ¼ 0:2, SPRT requires about three ob- respectively, then PT declares machine m as being nservations to detect that the machine is compromised if the compromised if N ! Ca and N P , where Ca is thedesired false positive rate is 0.01. As the behavior of a minimum number of messages that a machine must send,normal machine gets closer to that of compromised and P is the user-defined maximum spam percentage of amachine, i.e., 0 increases, a slightly higher number of normal machine. The first condition is in place forobservations are required for SPRT to reach a detection. preventing high false positive rates when a machine only In Fig. 2b, we assume the probability of a message being generates a small number of messages. For example, in anspam from a normal machine to be 0.2 (0 ¼ 0:2). From the extreme case, a machine may only send a single messagefigure, we can see that it also only takes a small number of and it is a spam, which renders the machine to have aobservations for SPRT to reach a decision. As the behavior 100 percent spam ratio. However, it does not make sense toof a compromised machine gets closer to that of a normal classify this machine as being compromised based on thismachine, i.e., 1 decreases, a higher number of observations small number of messages generated.are required for SPRT to reach a detection. 
In the following, we briefly compare the two spam From the figures, we can also see that, as the desired false zombie detection algorithms CT and PT with the SPOTpositive rate decreases, SPRT needs a higher number of system. The three algorithms have the similar running timeobservations to reach a conclusion. The same observation and space complexities. They all need to maintain a recordapplies to the desired false negative rate. These observations for each observed machine and update the correspondingillustrate the trade-offs between the desired performance of record as messages arrive from the machine. However,SPRT and the cost of the algorithm. In the above discussion, unlike SPOT, which can provide a bounded false positivewe only show the average number of required observations rate and false negative rate, and consequently, a confidencewhen H1 is true because we are more interested in the how well SPOT works, the error rates of CT and PT cannotspeed of SPOT in detecting compromised machines. The be a priori on E½NjH0 Š shows a similar trend (not shown). In addition, choosing the proper values for the four user- We note that the statistical tool SPRT assumes events (in defined parameters (,
5.3 Spam Count and Percentage-Based Detection Algorithms

For comparison, in this section, we present two different algorithms for detecting spam zombies, one based on the number of spam messages and the other on the percentage of spam messages sent from an internal machine. For simplicity, we refer to them as the count-threshold (CT) detection algorithm and the percentage-threshold (PT) detection algorithm, respectively.

In CT, time is partitioned into windows of fixed length T. A user-defined threshold parameter Cs specifies the maximum number of spam messages that may be originated from a normal machine in any time window. The system monitors the number of spam messages n originated from a machine in each window. If n > Cs, the algorithm declares that the machine has been compromised.

Similarly, in the PT detection algorithm, time is partitioned into windows of fixed length T. PT monitors two e-mail sending properties of each internal machine in each window: the total number of messages sent and the percentage of spam messages among them. If a machine sends at least Ca messages in a window and more than a fraction P of them are spam, PT declares the machine compromised.

In the following, we briefly compare the two spam zombie detection algorithms CT and PT with the SPOT system. The three algorithms have similar running time and space complexities. They all need to maintain a record for each observed machine and update the corresponding record as messages arrive from the machine. However, unlike SPOT, which can provide a bounded false positive rate and false negative rate, and consequently a confidence in how well SPOT works, the error rates of CT and PT cannot be bounded a priori. In addition, choosing proper values for the four user-defined parameters (α, β, θ1, and θ0) in SPOT is relatively straightforward (see the related discussion in the previous section). In contrast, selecting the "right" values for the parameters of CT and PT is much more challenging and tricky. The performance of the two algorithms is sensitive to the parameters used. They require a thorough understanding of the different behaviors of compromised and normal machines in the concerned network, and a training phase based on the behavioral history of the two types of machines, in order to work reasonably well. For example, it can be challenging to select the "best" length of the time windows in CT and PT to obtain the optimal false positive and false negative rates. We discuss how an attacker may try to evade CT and PT (and SPOT) in Section 7.
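The two threshold rules can be sketched as simple predicates; this is our own illustration, with the default values taken from the evaluation settings in Section 6.3 (T = 1 hour, Cs = 30, Ca = 6, P = 50 percent):

```python
def ct_flags(window_spam_count, Cs=30):
    """Count-threshold (CT): within a fixed-length time window, flag
    the machine if more than Cs spam messages were observed."""
    return window_spam_count > Cs

def pt_flags(window_total, window_spam_count, Ca=6, P=0.5):
    """Percentage-threshold (PT): within a window, flag the machine if
    at least Ca messages were observed and more than a fraction P of
    them are spam."""
    return window_total >= Ca and window_spam_count / window_total > P
```

Note that both predicates are evaluated once per window per machine, which is why both algorithms need a full window of observation before they can reach any decision.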
5.4 Impact of Dynamic IP Addresses

In the above discussion of the spam zombie detection algorithms, we have for simplicity ignored the potential impact of dynamic IP addresses and assumed that an observed IP corresponds to a unique machine. In the following, we informally discuss how well the three algorithms fare with dynamic IP addresses. We formally evaluate the impact of dynamic IP addresses on detecting spam zombies in the next section, using a two-month e-mail trace collected on a large US campus network.

SPOT can work extremely well in the environment of dynamic IP addresses. To understand the reason, we note that SPOT can reach a decision with a small number of observations, as illustrated in Fig. 2, which shows the average number of observations required for SPRT to terminate with a conclusion. In practice, we have noted that three or four observations are sufficient for SPRT to reach a decision for the vast majority of cases (see the performance evaluation of SPOT in the next section).
If a machine is compromised, it is likely that more than three or four spam messages will be sent before the (unwitting) user shuts down the machine and the corresponding IP address gets reassigned to a different machine. Therefore, dynamic IP addresses will not have any significant impact on SPOT.

Dynamic IP addresses can have a greater impact on the other two detection algorithms, CT and PT. First, both require continuous monitoring of the sending behavior of a machine for at least a specified time window, which in practice can be on the order of hours or days. Second, CT also requires a relatively large number of spam messages to be observed from a machine before reaching a detection. By properly selecting the values for the parameters of CT and PT (for example, a shorter time window for machines with dynamic IP addresses), they can also work reasonably well in the environment of dynamic IP addresses.

6 PERFORMANCE EVALUATION

In this section, we evaluate the performance of the three detection algorithms based on a two-month e-mail trace collected on a large US campus network. We also study the potential impact of dynamic IP addresses on detecting spam zombies.

6.1 Overview of the E-Mail Trace and Methodology

The e-mail trace was collected at a mail relay server deployed in the Florida State University (FSU) campus network between 8/25/2005 and 10/24/2005, excluding 9/11/2005 (we do not have a trace for that date). During the course of the e-mail trace collection, the mail server relayed messages destined for 53 subdomains in the FSU campus network. The mail relay server ran SpamAssassin [20] to detect spam messages. The e-mail trace contains the following information for each incoming message: the local arrival time, the IP address of the sending machine (i.e., the upstream mail server that delivered the message to the FSU mail relay server), and whether or not the message is spam. In addition, if a message has a known virus/worm attachment, it was so indicated in the trace by antivirus software. The antivirus software and SpamAssassin were two independent components deployed on the mail relay server. Due to privacy issues, we do not have access to the content of the messages in the trace.

Ideally, we would have collected all the outgoing messages in order to evaluate the performance of the detection algorithms. However, due to logistical constraints, we were not able to collect all such messages. Instead, we identified the messages in the e-mail trace that have been forwarded or originated by FSU internal machines, that is, the messages forwarded or originated by an FSU internal machine and destined to an FSU account. We refer to this set of messages as the FSU e-mails and perform our evaluation of the detection algorithms based on the FSU e-mails. We note that the set of FSU e-mails does not contain all the outgoing messages originated from inside FSU, so the compromised machines identified by the detection algorithms based on the FSU e-mails will likely be a lower bound on the true number of compromised machines inside the FSU campus network.

An e-mail message in the trace is classified as either spam or nonspam by SpamAssassin [20] deployed on the FSU mail relay server. For ease of exposition, we refer to the set of all messages, including both spam and nonspam, as the aggregate e-mails. If a message has a known virus/worm attachment, we refer to it as an infected message. We refer to an IP address of a sending machine as a spam-only IP address if only spam messages are received from the IP address. Similarly, we refer to an IP address as nonspam-only or mixed if we receive only nonspam messages, or both spam and nonspam messages, respectively, from the IP address.
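The spam-only/nonspam-only/mixed labels just defined can be sketched as a small classifier over the trace records; this is our own illustration, and the (ip, is_spam) pair format is a hypothetical stand-in for the trace fields:

```python
def classify_ips(messages):
    """Classify sending IP addresses as 'spam-only', 'nonspam-only',
    or 'mixed'. messages is an iterable of (ip, is_spam) pairs, one
    per message observed from that IP address."""
    seen = {}  # ip -> set of spam flags observed from that ip
    for ip, is_spam in messages:
        seen.setdefault(ip, set()).add(bool(is_spam))
    labels = {}
    for ip, kinds in seen.items():
        if kinds == {True}:
            labels[ip] = "spam-only"
        elif kinds == {False}:
            labels[ip] = "nonspam-only"
        else:
            labels[ip] = "mixed"
    return labels
```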
Table 1 (Summary of the E-Mail Trace) shows a summary of the e-mail trace. As shown in the table, the trace contains more than 25 M e-mails, of which more than 18 M, or about 73 percent, are spam. About half of the messages in the e-mail trace were originated or forwarded by FSU internal machines, i.e., contained in the set of FSU e-mails. Table 2 (Summary of Sending IP Addresses) shows the classification of the observed IP addresses. As shown in the table, during the course of the trace collection we observed more than 2 M IP addresses (2,461,114) of sending machines, of which more than 95 percent sent at least one spam message. During the same course, we observed 440 FSU internal IP addresses.

Table 3 (Summary of IP Addresses Sending Virus/Worm Attachments) shows the classification of the observed IP addresses that sent at least one message carrying a virus/worm attachment. We note that a higher proportion of FSU internal IP addresses sent e-mails with a virus/worm attachment than the overall IP addresses observed (all e-mails were destined to FSU accounts). This could be caused by a few factors. First, a (compromised) e-mail account in general maintains more e-mail addresses of friends in the same domain than in remote domains. Second, an (e-mail-propagated) virus/worm may adopt a spreading strategy concentrating more on local targets [2]. More detailed analysis of the e-mail trace can be found in [5] and [6], including the daily message arrival patterns and the behaviors of spammers at both the mail-server level and the network level.

In order to study the potential impacts of dynamic IP addresses on the detection algorithms, we obtain the subset of FSU IP addresses in the trace whose domain names contain "wireless," which normally have dynamically allocated IP addresses. For each of these IP addresses, we group the messages sent from the IP address into clusters, where the messages in each cluster are likely to be from the same machine (before the IP address is reassigned to a different machine). We group messages according to the interarrival times between consecutive messages, as discussed below. Let m_i for i = 1, 2, ... denote the messages sent from an IP address, and t_i the time when message i is received. Then messages m_i for i = 1, 2, ..., k belong to the same cluster if |t_i − t_{i−1}| ≤ T for i = 2, 3, ..., k, and |t_{k+1} − t_k| > T, where T is a user-defined time interval. We repeat the same process to group the remaining messages. Let m_i for i = j, j+1, ..., k be the sequence of messages in a cluster, arriving in that order. Then |t_k − t_j| is referred to as the duration of the cluster, and |t_{k+1} − t_k| as the time interval between the two clusters.

Fig. 3. Illustration of message clustering.
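The clustering rule above can be sketched as follows (a minimal illustration; the function names and the example timestamps are our own):

```python
def cluster_messages(arrival_times, T):
    """Group sorted message arrival times (in seconds) into clusters:
    a message joins the current cluster if it arrives within T seconds
    of the previous message, otherwise it starts a new cluster."""
    clusters = []
    for t in arrival_times:
        if clusters and t - clusters[-1][-1] <= T:
            clusters[-1].append(t)   # within T of the previous message
        else:
            clusters.append([t])     # gap larger than T: new cluster
    return clusters

def duration(cluster):
    """Cluster duration: time between its first and last message."""
    return cluster[-1] - cluster[0]
```

For example, with T = 1,800 seconds (the 30-minute threshold used in Section 6.4), arrival times [0, 60, 120, 4000, 4030] split into two clusters, since the 4000 − 120 gap exceeds T.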
The Evaluating the false negative rate of SPOT is a bit trickyintuition is that, if two messages come closely in time from by noting that SPOT focuses on the machines that arean IP address (within a time interval T ), it is unlikely that potentially compromised, but not the machines that arethe IP address has been assigned to two different machines normal (see Section 5). In order to have some intuitivewithin the short time interval. understanding of the false negative rate of the SPOT system, In the evaluation studies, we whitelist the known mail we consider the machines that SPOT does not identify asservers deployed on the FSU campus network, given that being compromised at the end of the e-mail collectionthey are unlikely to be compromised. If a deployed mail period, but for which SPOT has reset the records (lines 15 toserver forwards a large number of spam messages, it is 18 in Algorithm 1). That is, such machines have beenmore likely that machines behind the mail server are claimed as being normal by SPOT (but have continuouslycompromised. However, just based on the information been monitored). We also obtain the list of IP addresses thatavailable in the e-mail trace, we cannot decide which have sent at least a message with a virus/worm attachment.machines are responsible for the large number of spam Seven of such IP addresses have been claimed as beingmessages, and consequently, determine the compromised normal, i.e., missed, by SPOT.machines. Section 7 discusses how we can handle this case We emphasize that the infected messages are only usedin practical deployment. to confirm if a machine is compromised in order to study the performance of SPOT. Infected messages are not used6.2 Performance of SPOT by SPOT itself. SPOT relies on the spam messages insteadIn this section, we evaluate the performance of SPOT based of infected messages to detect if a machine has beenon the collected FSU e-mails. 
In all the studies, we set compromised to produce the results in Table 4. We make ¼ 0:01,
  39. 39. ¼ 0:01, 1 ¼ 0:9, and 0 ¼ 0:2. this decision by noting that, it is against the interest of a Table 4 shows the performance of the SPOT spam professional spammer to send spam messages with azombie detection system. As discussed above, there are virus/worm attachment. Such messages are more likely to
This is confirmed by the low percentage of infected messages in the overall e-mail trace shown in Table 1. Infected messages are more likely to be observed during the spam zombie recruitment phase than during the spamming phase. Infected messages can be easily incorporated into the SPOT system to improve its performance.

We note that both the actual false positive rate and the actual false negative rate are higher than the specified false positive rate and false negative rate, respectively. A number of factors can contribute to this observation. As we have discussed in Section 5, the IID assumption of the underlying statistical tool SPRT may not be satisfied in the spam zombie detection problem. In addition, the spam filter used to classify messages did not have perfect spam detection accuracy, and as a consequence, events may be misclassified. Moreover, the FSU e-mails can only provide a rather limited view of the outgoing messages originated from inside FSU, which will also affect the performance of SPOT in detecting spam zombie machines.

Fig. 4 (Number of actual observations) shows the distributions of the number of actual observations that SPOT takes to detect the compromised machines. As we can see from the figure, the vast majority of compromised machines can be detected with a small number of observations. For example, more than 80 percent of the compromised machines are detected by SPOT with only three observations, and all the compromised machines are detected with no more than 11 observations. This indicates that SPOT can quickly detect the compromised machines. We note that SPOT does not need compromised machines to send spam messages at a high rate in order to detect them. Here, "quick" detection does not mean a short duration, but rather a small number of observations. A compromised machine can send spam messages at a low rate (which, though, works against the interest of spammers), but it can still be detected once enough observations are obtained by SPOT.

6.3 Performance of CT and PT

In this section, we evaluate the performance of CT and PT and compare it with that of SPOT, using the same two-month e-mail trace collected on the FSU campus network. Recall that CT is a detection algorithm based on the number of spam messages originated or forwarded by an internal machine, and PT on the percentage of spam messages originated or forwarded by an internal machine (see Section 5.3). For comparison, we also include a simple spam zombie detection algorithm that identifies any machine sending at least a single spam message as a compromised machine. We note that such a simple scheme may not be deployable due to a potentially high false positive rate; however, it provides insight into the performance gain obtained by employing a more sophisticated spam zombie detection algorithm.

In this evaluation study, we set the length of the time windows to 1 hour, that is, T = 1 hour, for both CT and PT. For CT, we set the maximum number of spam messages that a normal machine can send within a time window to 30 (Cs = 30); that is, when a machine sends more than 30 spam messages within any time window, CT concludes that the machine is compromised. In PT, we set the minimum number of (spam and nonspam) messages within a time window to 6 (Ca = 6), and the maximum percentage of spam messages within a time window to 50 percent (P = 50%). That is, if more than 50 percent of all messages sent from a machine are spam in any time window with at least six messages, PT concludes that the machine is compromised. We choose the values for the parameters of PT in this way so that it is relatively comparable with SPOT. Recall that, based on our empirical study in the last section, the minimum number of observations needed by SPOT to reach a detection is 3 (when α = 0.01, β = 0.01, θ0 = 0.2, and θ1 = 0.9).
Table 5 (Performance of CT and PT) shows the performance of CT and PT, including the number of compromised IP addresses detected, confirmed, and missed. We use the same methods to confirm a detection or identify a missed IP address as we did with the SPOT detection algorithm. From the table we can see that CT and PT perform worse than SPOT. For example, CT only detects 81 IP addresses as being compromised. Among the 81 IP addresses, 79 can be confirmed to be associated with compromised machines. However, CT misses 53 IP addresses associated with compromised machines. The detection rate and false negative rate of CT are 59.8 and 40.2 percent, respectively, much worse than those of SPOT, which are 94.7 and 5.3 percent, respectively. We also note that all the compromised IP addresses detected (confirmed) using CT or PT are also detected (confirmed) using the SPOT detection algorithm. That is, the IP addresses detected (confirmed) using CT and PT are a subset of the compromised IP addresses detected (confirmed) using the SPOT detection algorithm. The IP addresses associated with compromised machines that are missed by SPOT are also missed by CT and PT. We conclude that SPOT outperforms both CT and PT in terms of both detection rate and miss rate.
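The quoted rates follow directly from the confirmed and missed counts reported above; the following is our own arithmetic check on those numbers (the rate definition over confirmed detections plus misses is inferred from the reported percentages):

```python
def rates(confirmed, missed):
    """Detection rate and false negative rate over the machines known
    to be compromised (confirmed detections + misses)."""
    total = confirmed + missed
    return confirmed / total, missed / total

# Reported counts: SPOT confirms 126 detections (110 + 16) and misses 7;
# CT confirms 79 detections and misses 53.
spot_det, spot_fn = rates(126, 7)   # about 94.7% and 5.3%
ct_det, ct_fn = rates(79, 53)       # about 59.8% and 40.2%
```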
In the table, we also show the performance of the simple spam zombie detection algorithm. We first note that the methods used to confirm a detection or identify a missed IP address differ from the ones used in SPOT, CT, and PT, due to the special property of the simple algorithm. For this simple algorithm, we confirm the detection of a compromised machine if the machine sent at least one outgoing message carrying a virus/worm attachment. Similarly, if a machine sent at least one outgoing message carrying a virus/worm attachment, but the simple algorithm fails to detect it as a compromised machine, then we have identified a missed IP address. In this process, we have similarly excluded all the whitelisted mail-server machines. As we discussed earlier, the antivirus software and SpamAssassin were two independent components deployed at the FSU mail relay server, and a small number of messages carrying virus/worm attachments were not detected as spam by the spam filter. Due to the difference in the methods of confirming a detection or identifying a missed IP address, the four detection algorithms observe different numbers of confirmed and missed IP addresses.

From the table we can see that the simple detection algorithm detects more machines (210) as being compromised than SPOT, CT, and PT. It also performs better than CT and PT in terms of both detection rate (89.7 percent) and false negative rate (10.3 percent). The simple algorithm performs worse than the SPOT algorithm (see Table 4). To a degree, this result is surprising, in that the simple detection algorithm can outperform both the CT and PT algorithms. However, we caution that this observation could be caused by the limitation of the FSU e-mails used in the study. Recall that this e-mail trace only contains outgoing messages destined to internal FSU accounts, and, as we have discussed, a higher portion of FSU internal IP addresses sent e-mails with a virus/worm attachment to FSU internal accounts than the overall IP addresses observed, which results in a higher confirmation rate in the compromised machine detection. We expect that the percentage of FSU machines sending e-mails with a virus/worm attachment would drop if the e-mail trace contained all outgoing messages (instead of only the ones destined to internal FSU accounts), which would adversely affect the observed performance of the simple detection algorithm.

6.4 Dynamic IP Addresses

In this section, we conduct studies to understand the potential impacts of dynamic IP addresses on the performance of the three detection algorithms. Given that SPOT outperforms both CT and PT, our discussion will focus on the impacts on SPOT; similar observations also apply to CT and PT.

In order to understand the potential impacts of dynamic IP addresses on the detection algorithms, we group messages from a dynamic IP address (with domain names containing "wireless") into clusters with a time interval threshold of 30 minutes. Messages with a consecutive interarrival time no greater than 30 minutes are grouped into the same cluster. Given the short interarrival times of messages within a cluster, we consider all the messages from the same IP address within each cluster as being sent from the same machine; that is, the corresponding IP address has not been reassigned to a different machine within the concerned cluster. (It is possible that messages from multiple adjacent clusters are actually sent from the same machine.)

Fig. 5 (Distribution of spam messages in each cluster) shows the cumulative distribution function (CDF) of the number of spam messages in each cluster. From the figure, we can see that more than 90 percent of the clusters have no less than 10 spam messages, and more than 96 percent no less than three spam messages. Given the large number of spam messages sent within each cluster, it is unlikely for SPOT to mistake one compromised machine for another when it tries to detect spam zombies. Indeed, we have manually checked that spam messages tend to be sent back to back in a batch fashion when a dynamic IP address is observed in the trace. Fig. 6 (Distribution of total messages in each cluster) shows the CDF of the number of all messages (including both spam and nonspam) in each cluster; similar observations can be made as for Fig. 5.

Fig. 7 (Distribution of the cluster duration) shows the CDF of the durations of the clusters. As we can see from the figure, more than 75 and 58 percent of the clusters last no less than 30 minutes and 1 hour (corresponding to the two vertical lines in the figure), respectively.
The longest duration of a cluster we observe in the trace is about 3.5 hours. Fig. 8 (Distribution of time intervals between clusters) shows the CDF of the time intervals between consecutive clusters. As we can see from the figure, the minimum time interval between two consecutive clusters is slightly more than 30 minutes (31.38 minutes), and the longest one is close to 13 days (18,649.38 minutes). Moreover, more than 88 percent of all intervals between clusters are longer than 1 hour.

Given the above observations, in particular the large number of spam messages in each cluster, we conclude that dynamic IP addresses do not have any important impact on the performance of SPOT. SPOT can reach a decision within the vast majority (96 percent) of the clusters in the setting used in the current performance study. It is unlikely for SPOT to mistake one compromised machine for another.

7 DISCUSSION

In this section, we discuss practical deployment issues and possible techniques that spammers may employ to evade the detection algorithms. Our discussion will focus on the SPOT detection algorithm.

7.1 Practical Deployment

To ease exposition, we have assumed that a sending machine m (Fig. 1) is an end-user client machine; it cannot be a mail relay server deployed by the network. In practice, a network may have multiple subdomains, each with its own mail servers, and a message may be forwarded by a number of mail relay servers before leaving the network. SPOT can work well in this kind of network environment. In the following, we outline two possible approaches. First, SPOT can be deployed at the mail servers in each subdomain to monitor the outgoing messages, so as to detect the compromised machines in that subdomain.

Second, and possibly more practically, SPOT is deployed only at the designated mail servers, which forward all outgoing messages (or SPOT gets a replicated stream of all outgoing messages), as discussed in Section 3. SPOT relies on the Received header fields to identify the originating machine of a message in the network [13], [18]. Given that the Received header fields can be spoofed by spammers [19], SPOT should only use the Received header fields inserted by the known mail servers in the network. SPOT can determine the reliable Received header fields by backtracking from the last known mail server in the network that forwards the message. It terminates and identifies the originating machine when an IP address in a Received header field is not associated with a known mail server in the network. Similar practical deployment methods also apply to the CT and PT detection algorithms.
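The backtracking idea can be sketched as follows; this is our own illustration (real Received fields require full RFC 2822 parsing, and the IP addresses shown are hypothetical):

```python
def find_originating_ip(received_ips, known_servers):
    """Walk the Received chain from the most recent hop backward and
    return the first IP that is not a known internal mail server,
    treated as the originating machine. received_ips lists the sending
    IP recorded in each Received field, most recent hop first;
    known_servers is the set of trusted internal mail-server IPs."""
    origin = None
    for ip in received_ips:
        origin = ip
        if ip not in known_servers:
            break  # beyond this hop the headers may be spoofed: stop here
    return origin
```

For example, with internal servers {"10.0.0.2", "10.0.0.3"} and the hop list ["10.0.0.2", "10.0.0.3", "192.168.5.77", "203.0.113.9"], the function returns "192.168.5.77": the first address past the trusted relays, i.e., the presumed client machine, while the earlier (possibly forged) hops are ignored.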
7.2 Possible Evasion Techniques

Given that the developed compromised machine detection algorithms rely on (content-based) spam filters to classify messages into spam and nonspam, spammers may try to evade the detection algorithms by evading the deployed spam filters. They may send completely meaningless "nonspam" messages (as classified by spam filters). However, this will reduce the real spamming rate, and hence the financial gains, of the spammers [4]. More importantly, as shown in Fig. 2b, even if a spammer reduces the spam percentage to 50 percent, SPOT can still detect the spam zombie with a relatively small number of observations (25 when α = 0.01, β = 0.01, and θ0 = 0.2). So, trying to send nonspam messages will not help spammers evade the SPOT system.

Moreover, in certain environments where user feedback is reliable, for example, feedback from users of the same network in which SPOT is deployed, SPOT can rely on classifications from end users (in addition to the spam filter). Although completely meaningless messages may evade the deployed spam filter, it is impossible for them to remain undetected by the end users who receive them. User feedback may be incorporated into SPOT to improve the spam detection rate of the spam filter. As we discussed in the previous section, trying to send spam at a low rate will also not evade the SPOT system: SPOT relies on the number of (spam) messages, not the sending rate, to detect spam zombies.

The current performance evaluation of SPOT was carried out using the FSU e-mails collected in the year 2005. However, based on the above discussion, we expect that SPOT will work equally well in today's environment. Indeed, as long as the spam filter deployed together with SPOT can provide a reasonable spam detection rate, the SPOT system should be effective in identifying compromised machines. We also argue that a spam zombie detection system such as SPOT is needed even more today, given that a large percentage of spam messages have originated from spamming bots in recent years. Various studies have shown that spam messages sent from botnets accounted for above 80 percent of all spam messages on the Internet [11], [15]. For example, the MessageLabs Intelligence 2010 annual security report showed that approximately 88.2 percent of all spam in 2010 was sent from botnets [15].

As we have discussed in Section 5.3, selecting the "right" values for the parameters of CT and PT is much more challenging and tricky than for SPOT. In addition, the parameters directly control the detection decisions of the two algorithms. For example, in CT, we specify the maximum number of spam messages that a normal machine can send. Once the parameters are learned by spammers, they can send spam messages below the configured thresholds to evade the detection algorithms. One possible countermeasure is to configure the algorithms with small threshold values, which helps reduce the spam sending rate of spammers from compromised machines, and therefore the financial gains of spammers. Spammers can also try to evade PT by sending meaningless "nonspam" messages. Similarly, user feedback can be used to improve the spam detection rate of spam filters to defeat this type of evasion.
8 CONCLUSION

In this paper, we developed an effective spam zombie detection system named SPOT by monitoring outgoing messages in a network. SPOT was designed based on a simple and powerful statistical tool named Sequential Probability Ratio Test to detect the compromised machines that are involved in spamming activities. SPOT has bounded false positive and false negative error rates. It also minimizes the number of required observations to detect a spam zombie. Our evaluation studies based on a two-month e-mail trace collected on the FSU campus network showed that SPOT is an effective and efficient system for automatically detecting compromised machines in a network. In addition, we also showed that SPOT outperforms two other detection algorithms based on the number and percentage, respectively, of spam messages sent by an internal machine.

ACKNOWLEDGMENTS

The authors thank the anonymous reviewers of IEEE INFOCOM 2009 and the IEEE Transactions on Dependable and Secure Computing; their insightful and constructive comments helped improve both the technical details and the presentation of the paper. Zhenhai Duan and Fernando Sanchez were supported in part by US National Science Foundation (NSF) Grant CNS-1041677. Yingfei Dong was supported in part by NSF Grants CNS-1041739 and CNS-1018971. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the NSF. A preliminary version of this paper appeared in the Proceedings of IEEE INFOCOM 2009 with the same title. This project was carried out while Peng Chen was a graduate student at the Florida State University, and while James Michael Barker was with the Florida State University.

REFERENCES

[1] P. Bacher, T. Holz, M. Kotter, and G. Wicherski, "Know Your Enemy: Tracking Botnets," http://www.honeynet.org/papers/bots, 2011.
[2] Z. Chen, C. Chen, and C. Ji, "Understanding Localized-Scanning Worms," Proc. IEEE Int'l Performance, Computing, and Comm. Conf. (IPCCC '07), 2007.
[3] R. Droms, "Dynamic Host Configuration Protocol," IETF RFC 2131, Mar. 1997.
[4] Z. Duan, Y. Dong, and K. Gopalan, "DMTP: Controlling Spam through Message Delivery Differentiation," Computer Networks, vol. 51, pp. 2616-2630, July 2007.
[5] Z. Duan, K. Gopalan, and X. Yuan, "Behavioral Characteristics of Spammers and Their Network Reachability Properties," Technical Report TR-060602, Dept. of Computer Science, Florida State Univ., June 2006.
[6] Z. Duan, K. Gopalan, and X. Yuan, "Behavioral Characteristics of Spammers and Their Network Reachability Properties," Proc. IEEE Int'l Conf. Comm. (ICC '07), June 2007.
[7] G. Gu, R. Perdisci, J. Zhang, and W. Lee, "BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection," Proc. 17th USENIX Security Symp., July 2008.
[8] G. Gu, P. Porras, V. Yegneswaran, M. Fong, and W. Lee, "BotHunter: Detecting Malware Infection through IDS-Driven Dialog Correlation," Proc. 16th USENIX Security Symp., Aug. 2007.
[9] G. Gu, J. Zhang, and W. Lee, "BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic," Proc. 15th Ann. Network and Distributed System Security Symp. (NDSS '08), Feb. 2008.
[10] N. Ianelli and A. Hackworth, "Botnets as a Vehicle for Online Crime," Proc. First Int'l Conf. Forensic Computer Science, 2006.
[11] J.P. John, A. Moshchuk, S.D. Gribble, and A. Krishnamurthy, "Studying Spamming Botnets Using Botlab," Proc. Sixth Symp. Networked Systems Design and Implementation (NSDI '09), Apr. 2009.
[12] J. Jung, V. Paxson, A. Berger, and H. Balakrishnan, "Fast Portscan Detection Using Sequential Hypothesis Testing," Proc. IEEE Symp. Security and Privacy, May 2004.
[13] J. Klensin, "Simple Mail Transfer Protocol," IETF RFC 2821, Apr. 2001.
[14] J. Markoff, "Russian Gang Hijacking PCs in Vast Scheme," The New York Times, http://www.nytimes.com/2008/08/06/technology/06hack.html, Aug. 2008.
[15] P. Wood et al., "MessageLabs Intelligence: 2010 Annual Security Report," 2010.
[16] S. Radosavac, J.S. Baras, and I. Koutsopoulos, "A Framework for MAC Protocol Misbehavior Detection in Wireless Networks," Proc. Fourth ACM Workshop Wireless Security, Sept. 2005.
[17] A. Ramachandran and N. Feamster, "Understanding the Network-Level Behavior of Spammers," Proc. ACM SIGCOMM, pp. 291-302, Sept. 2006.
[18] P. Resnick, "Internet Message Format," IETF RFC 2822, Apr. 2001.
[19] F. Sanchez, Z. Duan, and Y. Dong, "Understanding Forgery Properties of Spam Delivery Paths," Proc. Seventh Ann. Collaboration, Electronic Messaging, Anti-Abuse and Spam Conf. (CEAS '10), July 2010.
[20] SpamAssassin, "The Apache SpamAssassin Project," http://spamassassin.apache.org, 2011.
[21] A. Wald, Sequential Analysis. John Wiley & Sons, 1947.
[22] G.B. Wetherill and K.D. Glazebrook, Sequential Methods in Statistics. Chapman and Hall, 1986.
[23] M. Xie, H. Yin, and H. Wang, "An Effective Defense against Email Spam Laundering," Proc. ACM Conf. Computer and Comm. Security, Oct./Nov. 2006.
[24] Y. Xie, F. Xu, K. Achan, E. Gillum, M. Goldszmidt, and T. Wobber, "How Dynamic Are IP Addresses?" Proc. ACM SIGCOMM, Aug. 2007.
[25] Y. Xie, F. Xu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov, "Spamming Botnets: Signatures and Characteristics," Proc. ACM SIGCOMM, Aug. 2008.
[26] L. Zhuang, J. Dunagan, D.R. Simon, H.J. Wang, I. Osipkov, G. Hulten, and J.D. Tygar, "Characterizing Botnets from Email Spam Records," Proc. First USENIX Workshop Large-Scale Exploits and Emergent Threats, Apr. 2008.

Zhenhai Duan received the BS degree from Shandong University, China, in 1994, the MS degree from Beijing University, China, in 1997, and the PhD degree from the University of Minnesota, in 2003, all in computer science. He joined the faculty of the Department of Computer Science at the Florida State University in 2003, where he is now an associate professor. His research interests include computer networks and network security. He was a corecipient of the Best Paper Award of the 2002 IEEE International Conference on Network Protocols (ICNP), the 2006 IEEE International Conference on Computer Communications and Networks (ICCCN), and IEEE GLOBECOM 2008. He has served as a TPC cochair for CEAS 2011, the IEEE GLOBECOM 2010 Next-Generation Networking Symposium, and the IEEE ICCCN 2007 and 2008 Network Algorithms and Performance Evaluation Track. He is a member of the ACM and a senior member of the IEEE.
Peng Chen received the BEng degree in management information systems from Tianjin University, China, in 1999, the MEng degree in computer science from Beihang University, China, in 2006, and the MS degree in computer science from the Florida State University in 2008. He is currently a software engineer at Juniper Networks, Sunnyvale, California. His research interests include computer networks, Internet routing protocols, and networking security.

Fernando Sanchez received the BS degree from the Universidad San Francisco de Quito, Ecuador, in 2002, and the MS degree from the Florida State University in 2007, both in computer science. He is currently working toward the PhD degree in the Department of Computer Science at the Florida State University. His research interests include network security, with particular interest in the security of networking protocols and network services.

Yingfei Dong received the BS and MS degrees in computer science from the Harbin Institute of Technology, P.R. China, in 1989 and 1992, the doctoral degree in engineering from Tsinghua University in 1996, and the PhD degree in computer and information science from the University of Minnesota in 2003. He is currently an associate professor in the Department of Electrical Engineering at the University of Hawaii at Manoa. His current research focuses on computer networks, especially network security, real-time networks, reliable communications, multimedia content delivery, Internet services, and distributed systems. His work has been published in many refereed journals and conferences. He has served as both an organizer and a program committee member for many IEEE/ACM/IFIP conferences. His current research is supported by the US National Science Foundation. He is a member of the IEEE.

Mary Stephenson received BS degrees in meteorology and in mathematics/computer science from the Florida State University in 1983 and the MS degree in meteorology from the University of Oklahoma in 1987. She is currently an associate director in Information Technology Services at the Florida State University, within the Infrastructure and Operations Services Group. She is responsible for the University's enterprise Unix and Linux systems, storage, backup, and virtualization technologies, as well as the University's data center facility.

James Michael Barker received the BA degree in philosophy from the College of William and Mary and the MA and PhD degrees in philosophy from the Florida State University. He is currently employed at the University of North Carolina at Chapel Hill (UNC Chapel Hill) as an assistant vice chancellor and chief technology officer. He is responsible for the Infrastructure and Operations Division as well as the Communication Technologies Division. His areas of operational oversight include UNC Chapel Hill's network, telephony, datacenters, messaging/communication services, operating systems hosting, identity management, middleware services, storage systems, and workload automation. In this role, he is responsible for the core information technology infrastructure services of UNC Chapel Hill's "enterprise" as well as building block services for campus units. He previously served as the director of University Computing Services at the Florida State University. His dominant research interests include Kant, the history of modern philosophy, and philosophy of technology.
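As a closing illustration, the statistical test at the core of SPOT, Wald's Sequential Probability Ratio Test, can be sketched in a few lines. This is a minimal, hypothetical sketch and not the authors' implementation: the spam probabilities `theta0`/`theta1` and the error bounds `alpha`/`beta` are example values chosen for illustration, and each message is assumed to arrive already labeled by a spam filter (1 = spam, 0 = nonspam).

```python
from math import log

def sprt_detector(observations, theta0=0.2, theta1=0.8, alpha=0.01, beta=0.01):
    """Run Wald's SPRT over a stream of message labels from one machine.

    theta0/theta1 are the assumed per-message spam probabilities of a
    normal and a compromised machine (example values); alpha and beta
    bound the false positive and false negative rates.  Returns
    ("H1", n) if the machine is flagged as compromised after n messages,
    ("H0", n) if declared normal, or ("pending", n) if neither Wald
    boundary has been crossed yet.
    """
    A = log((1 - beta) / alpha)   # upper boundary: accept H1 (compromised)
    B = log(beta / (1 - alpha))   # lower boundary: accept H0 (normal)
    llr = 0.0                     # cumulative log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x == 1:
            llr += log(theta1 / theta0)           # spam observed
        else:
            llr += log((1 - theta1) / (1 - theta0))  # nonspam observed
        if llr >= A:
            return ("H1", n)
        if llr <= B:
            return ("H0", n)
    return ("pending", len(observations))
```

With these example parameters, four consecutive spam messages already cross the upper boundary and flag the machine, which mirrors the conclusion's point that SPRT needs only a small number of observations while keeping both error rates bounded (roughly by `alpha` and `beta`).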