ISSN: 2278 – 1323 International Journal of Advanced Research in Computer Engineering & Technology Volume 1, Issue 4, June 2012

A FAST POSITIVE APPROACH OF P-DPL IN THE PACKET INSPECTION

1 N. Kannaiya Raja, 2 K. Arulanandam, 3 M. Balaji

1 N. Kannaiya Raja, M.E., (Ph.D.), A.P/CSE Dept., Arulmigu Meenakshi Amman College of Engg, Thiruvannamalai Dt, near Kanchipuram
2 Dr. K. Arulanandam, Prof & Head, CSE Department, Ganadipathy Tulsi's Jain Engineering College, Vellore, sakthsivamkva@gmail.com
3 M. Balaji, M.E., Arulmigu Meenakshi Amman College of Engg, Thiruvannamalai Dt, near Kanchipuram

Abstract- The signature extraction process is based on a comparison with a common function repository. By eliminating functions that appear in the common function repository from the signature candidate list, P-DPL can minimize the risk of false-positive detection errors. To reduce false-positive rates further, P-DPL proposes intelligent candidate selection using an entropy score to generate signatures. Evaluation of P-DPL was conducted under various conditions. The findings suggest that the proposed method can be used for automatically generating signatures that are both specific and sensitive. In this paper we propose a new automatic mechanism, termed P-DPL, for extracting signatures from malware files and unwanted mapping files. Signatures generated by P-DPL comprise multiple byte strings, which can be used by high-speed, network-based malware filtering devices. In order to minimize the risk of false positives (i.e., detection of a malware signature in benign executable files), P-DPL employs a method for sanitizing executable files of chunks of code that originate from the underlying standard development platforms and are replicated in various instances of benign and malicious programs developed with these platforms. With this method we have developed a new way to find malicious data in the packet. Another direction we intend to examine is the use of a malware function library (MFL) in the signature generation process in order to further strengthen the signatures and minimize the risk of false positives. In addition, regular expressions defined by two or more distinct signatures can be used to further minimize the risk of false positives.

Key words: Packet-Deployment payload (P-DPL), Automatic signature generation (ASG), malware, malware filtering.

I. INTRODUCTION

Communication systems are highly sensitive to various types of attack. A common way of launching these attacks is by means of malicious software, such as worms, viruses, and Trojan horses. When it spreads, malware can cause severe problems for users, companies, and governments. The development of high-speed Internet connections makes it easier to create new malware and to spread it rapidly. Several techniques for detecting and deleting malware have been proposed. They are of two types: static and dynamic. Dynamic detection is based on information collected from the operating system during execution of the program (system calls, network access, and file and memory modifications) [1]–[3]. Static detection is based on information extracted explicitly or implicitly from the executable source code. The main advantage of static detection is that it provides rapid classification. Since antivirus vendors handle an overwhelming number of suspect files for inspection every day [4], fast detection is essential. Static analysis solutions are mainly implemented using two methods: signature-based and heuristic-based. Signature-based methods rely on finding unique strings in the source code [4]. Heuristic-based methods rely on rules, determined either by expert staff or by machine, that specify malicious behavior [5], [6]. As a case in point, Zhang et al. [7] used the random forest data-mining algorithm to detect misuse and abnormal network intrusions.

The period from the release of an unknown malware until security software and hardware vendors update their clients with the proper malware signature is extremely critical. During this time, the malware is undetectable by most signature-based solutions and is usually termed a zero-day attack. Because such malware can easily spread and corrupt machines, it is essential to detect it as soon as possible so that signature-based solutions can generate a suitable signature to block the threat. Malware filtering devices defend organizations by stopping all types of malware. They carry out deep packet inspection, checking packets against all signatures to detect and remove attacks such as worm propagation, denial of service, or remote exploitation of vulnerabilities. These devices monitor the network and analyze the content of the packets. This paper concerns the process of generating unique signatures for malware filtering devices.

Different methods for automatic signature generation have been proposed in this domain. These techniques focus on malware such as worms, where the signature is extracted after the malware is executed in the course of launching the attack. Extracting signatures from full-fledged malware executables, which may contain a significant portion of code emanating from development tools and platforms, calls for a different method. In this research we identify these problems and evaluate an automatic signature generation technique, P-DPL.

Fig. 1. P-DPL creation and signature generating processes.

P-DPL is designed to create multiple-string signatures that can be used in intrusion detection systems for filtering malware. To improve its precision, P-DPL applies a
complete and structured method that separates the malware's unique code from other segments of common and usually benign code, such as library files. When the sanitization process ends, the remaining code is the malicious code. The process then continues by generating a unique signature from the malicious code, which can be used for removing the malware. The main objective of this research is to create signatures for malware, spyware, Trojan horses, worms, and viruses. The main assumption is that in a preceding step, suspected files are classified as benign or malicious by a human expert or by an automated detection tool. This assumption allows us to focus on the signature-generation process, but it also makes the quality of the signatures depend on the accuracy of the classification of suspicious files.

P-DPL was used as the automatic signature generation (ASG) module of the eDare (early detection, alert, and response) framework [8]. eDare is aimed at mitigating the spread of both known and unknown malware in computer networks. eDare operates by first monitoring network traffic and filtering out known malware using high-speed filtering devices that are continuously updated with signatures generated by P-DPL. Next, unknown files are extracted from the remaining traffic and examined using various machine-learning and temporal reasoning methods in order to classify the files as malicious or benign. P-DPL is applied in the last step to extract signatures from newly detected malicious files. When eDare identifies a new threat, P-DPL automatically produces a signature, and the filtering devices stationed on the network infrastructure are then automatically updated. Because this process is fast, and faster than human intervention, it is effective against zero-day attacks. This paper presents the P-DPL technique and a set of experiments performed on a collection of malicious and benign executables. Our work also addresses the length of a signature and the selection of a signature among several candidates.

II. Related Works

A signature must be general enough to capture instances of the malware, yet sufficiently specific to avoid overlapping with the content of normal traffic, in order to minimize false positives. Malware signatures can be classified as vulnerability-based, exploit-based, and payload-based [9]. A vulnerability-based signature describes the properties of a certain bug in the system that can be maliciously exploited by the malware. A vulnerability-based signature does not need to detect each and every piece of malicious code exploiting the vulnerability, so it is very effective when dealing with polymorphic (multiform) malware. However, a vulnerability-based signature can be generated only once the vulnerability has been found. An exploit-based signature describes a sequence of commands triggered by the malware, which exploits a vulnerability in the system. Exploit-based methods include Autograph, the P-DPL sensor, and Netspy, which focus on analyzing similarities in packet payloads belonging to network traffic. These systems first identify abnormal traffic originating from suspicious IP addresses and then generate a signature by identifying the most frequently occurring byte sequences. The Nemean architecture first clusters similar sessions and then uses machine-learning techniques to generate semantic-aware signatures for each cluster. Polygraph expands the notion of single substring signatures to conjunctions and to ordered sets of multiple tokens that match multiple variants of polymorphic worms. Honeycomb overlays parts of the flows in the traffic and uses a longest common substring (LCS) algorithm to spot similarities in packet payloads. Subsequent work designed a double-honeypot system and introduced position-aware distribution signatures (PADS), which are computed from polymorphic worm samples and are composed of a byte frequency distribution instead of a fixed value for each position in the signature "string." Tang et al. [11] use sequence alignment techniques, drawn from bioinformatics, to derive simplified regular-expression exploit-based signatures.

Exploit-based signatures can be generated quickly to detect zero-day exploits of undiscovered vulnerabilities. However, they cope poorly with polymorphic malware. In addition, the signatures created by the above techniques are extracted and tested for short malware such as worms, whereas malware such as viruses and Trojan horses can be large executable files consisting of full-fledged applications. These files usually contain a significant portion of code segments that are contributed by the software development platform used to build the malware. For such large malware files, it is difficult to select a signature that is both sensitive and specific. Another limitation of these techniques is that they focus on detecting malware after it has been unleashed and try to generate a signature from the traffic it creates while the attack is in progress.

A payload-based signature identifies the malware code itself. This paper falls into the payload-based signature category. Payload-based signature generation methods are presented in [4]. One such approach is a two-step statistical method for automatically extracting "the best" signatures from the code of a malware. First, programs on isolated machines are intentionally infected with the virus. The infected portions of the programs are compared with one another to find the regions of the virus that are constant from one instance to another. These regions are considered signature candidates. The second phase estimates the probability that each candidate signature will match a randomly chosen sequence. The candidate with the lowest estimated false-positive probability is selected as the signature. The Hancock system [4] was proposed for automatically extracting signatures for antivirus software. Based on several heuristics, the Hancock system generates a set of signature candidates, selecting the candidates that are not likely to be found in benign code. Like our approach, Hancock relies on modeling benign code in order to minimize false-alarm risks. The Auto-Sign signature generator modeled both benign and malicious code using a byte 3-gram representation in order to select good signature candidates. Next, the signature candidates are ranked according to three different measures in order to select the best signature. However, the Hancock system and Auto-Sign differ from our approach, which is semantic-aware in the sense that it does not rely on arbitrary byte code sequences but on the code representing internal functions of the software. In addition, because the methods presented in [4] focus on generating signatures for antivirus software, the limitation on signature length is not necessarily considered.

Other solutions have been proposed for protecting systems and preventing an attack beforehand rather than detecting the attack after it has been launched. This can be done by generating signatures based on sequences of instructions that represent malicious or benign behavior. These sequences can be extracted either by statically analyzing the program after disassembly or
by monitoring the program during execution. For example, protecting a system from buffer overflow attacks can be achieved by: 1) creating signatures for legitimate instruction blocks and matching instruction sequences of monitored programs with the signature repository; 2) obfuscating pointers in such a way that a malicious application that tries to exploit a buffer overflow vulnerability will not be able to create valid pointers; or 3) applying array and pointer boundary checking. As opposed to such methods, our goal in this research is to generate signatures for high-speed traffic filtering devices that do not rely on installation or modification of end points and that will protect the end points at the network level.

In summary, each of the aforementioned techniques suffers from at least one critical limitation. Some rely on small and coherent malware files, but such files may not constitute the general case. Other techniques rely on observing malware behavior, but such malware cannot always be fully monitored. Other methods search for similarities between packets, which does not guarantee a truly low false-positive rate. Our method disregards the malware size assumption. In addition, it does not require activating the malware.

Modern anti-virus software typically employs a variety of methods to detect malware programs, such as signature-based scanning, heuristic-based detection, and behavioral detection [10]. Although less proactive, signature-based malware scanning is still the most prevalent approach to identifying malware because of its efficiency and low false positive rate. Traditionally, malware signatures are created manually, which is both slow and error-prone. As a result, efficient generation of malware signatures has become a major challenge for anti-virus companies that must handle the exponential growth of unique malware files. To solve this problem, several automatic signature generation approaches have been proposed.

Most previous work focused on creating signatures that are used by Network Intrusion Detection Systems (NIDS) to detect network worms. Singh et al. proposed EarlyBird [11], which used packet content prevalence and address dispersion to automatically generate worm signatures from the invariant portions of worm payloads. Autograph exploited a similar idea to create worm signatures by dividing each suspicious network flow into blocks terminated by some breakmark and then analyzing the prevalence of each content block. The suspicious flows are selected by a port-scanning flow classifier to reduce false positives. Kreibich and Crowcroft developed Honeycomb, a system that uses honeypots to gather inherently suspicious traffic and generates signatures by applying the longest common substring (LCS) algorithm to search for similarities in the packet payloads. One potential drawback of signatures generated by these approaches is that they are all continuous strings and may fail to match polymorphic worm payloads. Polygraph instead searched for invariant content in the network flows and created signatures consisting of multiple disjoint content substrings. Polygraph also utilized a naive Bayes classifier to allow probabilistic matching and classification, and thus provided better proactive detection capabilities. Another system used a model-based algorithm to analyze the invariant contents of polymorphic worms and analytically prove the attack resilience of the generated signatures. PADS (Position-Aware Distribution Signatures) took advantage of a statistical anomaly-based approach to improve the resilience of signatures to polymorphic malware variants.

Another common method for detecting polymorphic malware is to incorporate semantics awareness into signatures. For example, Christodorescu et al. proposed static semantics-aware malware detection. They applied a matching algorithm on the disassembled binaries to find the instruction sequences that match manually generated templates of malicious behaviors, e.g., a decryption loop. Nemean, a framework for automatic generation of intrusion signatures from honeynet packet traces, applied clustering techniques on connections and sessions to create protocol-semantic-aware signatures, thereby reducing the possibility of false alarms.

Another loosely related area is the automatic generation of attack signatures, vulnerability signatures, and software patches. TaintCheck [12] and Vigilante [13] applied taint analysis to track the propagation of network inputs to data used in attacks, e.g., jump addresses, format strings, and system call arguments, which are used to create signatures for the attacks. Other heuristic-based approaches [14] have also been proposed to exploit properties of specific exploits (e.g., buffer overflow) and create attack signatures. Generalizing from these approaches, Brumley et al. proposed a systematic method that used a formal model to reason about vulnerability signatures and quantify the signature qualities. An alternative approach to preventing malware from exploiting vulnerabilities is to apply data patches in the firewalls to filter malicious traffic. One system for automatically generating data patches leveraged knowledge of the data format of malicious attacks to generate potential attack instances and then created signatures from the instances that successfully exploited the vulnerabilities.

Hancock differs from previous work by focusing on automatically generating high-coverage string signatures with extremely low false positives. Our research was based loosely on virus signature extraction work that was commercially used by IBM. They used a 5-gram Markov chain model of good software to estimate the probability that a given byte sequence would show up in good software. They tested hand-generated signatures and found that it was quite easy to set a model probability threshold with a zero false positive rate and a modest false negative rate (the fraction of rejected signatures that would not be found in goodware) of 48%. They also generated signatures from assembly code (as Hancock does), rather than data, and identified candidate signatures by running the malware in a test environment. Hancock does not do this, as dynamic analysis is very slow in large-scale applications. Symantec acquired this technology from IBM in the mid-90s and found that it led to many false positives. The Symantec engineers believed that it worked well for IBM because IBM's anti-virus technology was used mainly in corporate environments, making it much easier for IBM to collect a representative set of goodware. By contrast, signatures generated by Hancock are mainly for home users, who have a much broader set of goodware. The model's training set cannot possibly contain, or even represent, all of this goodware. This poses a significant challenge for Hancock in avoiding FP-prone signatures.

III. Payload Based Anomaly Detection

A. Overview of the P-DPL Sensor:
The P-DPL sensor is based on the principle that zero-day attacks are delivered in packets whose data is unusual and distinct from all prior "normal content" flowing to or from the victim site. We assume that the packet content is available to the sensor for modeling. We compute a normal profile of a site's unique content flow and use this information to detect anomalous data. A "profile" is a model or a set of models that represent the set of data seen during training. Since we are profiling content data flows, the method must be general enough to work across all sites and all services, and it must be efficient and accurate. Our initial design of P-DPL uses a "language independent" methodology, the statistical distribution of n-grams [15] extracted from network packet datagrams. This methodology requires no parsing, no interpretation, and no emulation of the content.

An n-gram is a sequence of n adjacent byte values in a packet payload. A sliding window of width n is passed over the whole payload one byte at a time, and the frequency of each n-gram is computed. This frequency count distribution represents a statistical centroid or model of the content flow. The normalized average frequency and the variance of each gram are computed. The first implementation of P-DPL uses the byte value distribution, i.e., n=1. The statistical means and variances of the 1-grams are stored in two 256-element vectors. However, we condition a distinct model on the port (or service) and on packet length, producing a set of statistical centroids that in total provides a fine-grained, compact, and effective model of a site's actual content flow. Full details of this method and its effectiveness are described in [18].

To compare the similarity between test data at detection time and the trained models computed during the training period, P-DPL uses simplified Mahalanobis distance [18]. Mahalanobis distance weights each variable, the mean frequency of a 1-gram, by its standard deviation and covariance. The distance values produced by the models are then subjected to a threshold test. If the distance of a test datum is greater than the threshold, P-DPL issues an alert for the packet. There is a distinct threshold setting for each centroid, computed automatically by P-DPL during a calibration step.

To calibrate the sensor, a sample of test data is measured against the centroids and an initial threshold setting is chosen. A subsequent round of testing on new data updates the threshold settings to calibrate the sensor to the operating environment. Once this step converges, P-DPL is ready to enter detection mode. Although the very initial results of testing P-DPL looked quite promising, we devised several improvements to the modeling technique to reduce the percentage of false positives.

The first packet of CRII illustrates the 1-gram data representation implemented in P-DPL. Figure 2 shows a portion of the CRII packet, and its computed byte value distribution along with the rank-ordered distribution is displayed in Figure 3, from which we extract a Z-string. The Z-string is the string of distinct bytes ordered by their frequency in the data from most frequent to least, serving as a representative of the entire distribution and ignoring those byte values that do not appear in the data. The rank-ordered distribution appears similar to the Zipf distribution, hence the name Z-string. The Z-string representation provides a privacy-preserving summary of the payload that may be exchanged between domains without revealing the true content.

Fig. 3. CRII payload distribution (top plot) and its rank order distribution (bottom plot)
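The 1-gram model, the distance test, and the Z-string described in this subsection can be illustrated with a short Python sketch. It is not the authors' implementation. The function names are hypothetical, the model stores standard deviations rather than variances, and the distance formula (sum of absolute deviations from the mean, each divided by the standard deviation plus a small smoothing factor `alpha`) is an assumed reading of "simplified Mahalanobis distance"; the paper defers the exact definition to [18].

```python
from collections import Counter
from math import sqrt


def byte_distribution(payload):
    """Relative frequency of each of the 256 byte values (1-grams)."""
    counts = Counter(payload)
    total = len(payload) or 1  # guard against empty payloads
    return [counts.get(b, 0) / total for b in range(256)]


def train_centroid(payloads):
    """Return two 256-element vectors: per-byte mean and standard deviation."""
    dists = [byte_distribution(p) for p in payloads]
    n = len(dists)
    means = [sum(d[b] for d in dists) / n for b in range(256)]
    stds = [sqrt(sum((d[b] - means[b]) ** 2 for d in dists) / n)
            for b in range(256)]
    return means, stds


def simplified_mahalanobis(payload, means, stds, alpha=0.001):
    """Assumed form: sum over bytes of |x - mean| / (std + alpha)."""
    dist = byte_distribution(payload)
    return sum(abs(x - m) / (s + alpha)
               for x, m, s in zip(dist, means, stds))


def z_string(payload):
    """Distinct bytes ordered from most to least frequent (ties by byte value)."""
    counts = Counter(payload)
    return bytes(sorted(counts, key=lambda b: (-counts[b], b)))


# Toy usage: two "normal" training payloads, then one normal and one odd test.
training = [b"GET /index.html HTTP/1.0", b"GET /about.html HTTP/1.0"]
means, stds = train_centroid(training)
normal_score = simplified_mahalanobis(b"GET /index.html HTTP/1.0", means, stds)
odd_score = simplified_mahalanobis(b"\x90" * 24, means, stds)
```

In this sketch a packet would raise an alert when its score exceeds a per-centroid threshold chosen during calibration. The payload of repeated 0x90 bytes scores far higher than the normal request because every byte frequency it contains deviates from a model that has almost no variance at those positions.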
Z-strings are not used for detection, but rather for message exchange and cross-domain correlation of alerts.

GET./default.ida?XXXXXXXXXXX x
XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXX
XXXXXX%u9090%u6858%ucbd3%u7
801%u9090%u6858%ucbd3%u7801%u
9090%u6858%ucbd3%u7801%u9090%
u9090%u8190%u00c3%u0003%u8b000
%u531b%u53ff%u0078%u0000%u0
u0

Fig. 2. A portion of the first packet of CodeRed II

B. New P-DPL Features

Multiple Centroids

P-DPL is a fully automatic, "hands-free" online anomaly detection sensor. It trains models and determines when they are stable; it is self-calibrating, automatically observes itself, and updates its models as warranted. The most important new features implemented in P-DPL over our prior work are the use of multiple centroids and ingress/egress correlation. In the first implementation, P-DPL computes one centroid per length bin, followed by a stage of clustering similar centroids across neighboring bins. We previously computed a model M_ij for each specific observed packet payload length i of each port j. In this newer version, we compute a set of models M^k_ij, k ≥ 1. Hence, within each length bin, multiple models are computed prior to a final clustering stage. The clustering is now executed across centroids within a length bin, and then again across neighboring length bins. This reduces memory requirements for models while representing normal content flow more accurately and revealing anomalous data with
greater clarity. Since there might be different types of payload sent to the same service, e.g., pure text, .pdf, or .jpg, we used an incremental online clustering algorithm to create multiple centroids to model the traffic with finer granularity. This modeling idea can be extended to include centroids for the different media types that may be transmitted in packet flows. Different file and media types follow their own characteristic 1-gram distribution; including models for standard file types can help reduce false positives. The multi-centroid strategy requires a different test methodology. During testing, an alert will be generated by P-DPL if a test packet matches none of the centroids within its length bin. The multi-centroid technique produces more accurate payload models and separates the anomalous payloads in a more precise manner.

C. Data Diversity across Sites

A crucial issue we study is whether or not payload models are truly distinct across multiple sites. This is an important question in a collaborative security context. We have claimed that the monoculture problem applies not only to common services and applications, but also to security technologies. Hence, if a site is blind to a zero-day attack, this implies that many other sites are blind to the same attack. Researchers are considering solutions to the monoculture problem by various techniques that "diversify" implementations. We conjecture that the content data flow among different sites is already diverse even when running the exact same services. In our previous work we have shown that byte distributions differ for each port and length. We also conjecture that it should be different for each host. For example, each web server contains different URLs and implements different functionality such as web email or media uploads, and the population of service requests and responses sent to and from each site may differ, producing a diverse set of content profiles across all collaborating hosts and sites. Hence, each host or site's profile will be substantially different from all others. A zero-day attack that may appear as normal data at one site will likely not appear as normal data at other sites, since the normal profiles are different. We test whether or not this conjecture is true by several experiments.

One of the most difficult aspects of doing research in this area is the lack of real-world datasets available to researchers that have full packet content for formal scientific study. Privacy policies typically prevent sites from sharing their content data. However, we were able to use data from three sources, and we show the distribution of each. The first one is an external commercial organization that wishes to remain anonymous, which we call EX. The others are the two web servers of the CS Department of Columbia, which we call W and W1, respectively. The following plots show the profiles of the traffic content flow of each site. The plots display the payload distributions for different packet payload lengths, i.e., 249 bytes and 1380 bytes, spanning the whole range of possible payload lengths in order to give a general view of the diversity of the data coming from the three sites. Each byte distribution corresponds to the first centroid that is built for the respective payload length. We observe from the plots that there is a visible difference in the byte distributions among the sites for the same length bin. This is confirmed by the values of Manhattan distances computed between the distributions, with results displayed in Table 1.

The content traffic among the sites is quite different. For example, the EX dataset is more complex, containing file uploads of different media types (pdf, jpg, ppt, etc.) and webmail traffic; the W dataset contains less of this type of traffic, while W1 is the simplest, containing almost no file uploads. Hence, each of the site-specific payload models is diverse, increasing the likelihood that a worm payload will be detected by at least one of these sites. To avoid detection, the worm exploit would have to be padded in such a way that its content description would appear to be normal concurrently for all of these sites.

Fig. 4. Example byte distribution for payload length 249 of port 80 for the three sites EX, W, W1, in order from top to bottom (x-axis: ASCII characters 0-255)

Fig. 5. Example byte distribution for payload length of 1380 of port 80 for the three sites (x-axis: ASCII characters 0-255)

Table 1. The Manhattan distance between the byte distributions
of the profiles computed for the three sites, for three length bins.

             249 bytes   940 bytes   1380 bytes
MD(EX, W)    0.4841      0.6723      0.2533
MD(EX, W1)   0.3710      0.8120      0.4962
MD(W, W1)    0.3689      0.5972      0.6116

Mimicry attacks are possible if the attacker has access to the same information as the victim. In the case of application payloads, attackers (including worms) would not know the distribution of the normal flow to their intended victim. The attacker would need to sniff each site for a long period of time and analyze the traffic in the same fashion as the detector described herein, and would then also need to figure out how to pad the poison payload to mimic the normal model. This is a daunting task for the attacker, who would have to be clever indeed to guess the exact distribution as well as the threshold logic in order to deliver attack data that would go unnoticed. Additionally, any attempt to do this via probing, crawling or other means is very likely to be detected.

Besides mimicry attacks, clever worm writers may figure out a way to launch "training attacks" against anomaly detectors such as P-DPL. In this case, the worm may send a stream of content with increasing diversity to its next victim site in order to train the content sensor to produce models in which its exploit would no longer appear anomalous. This, too, is a daunting task for the worm. The worm would be fortunate indeed to launch its training attack while the sensor is in training mode, and to have a stream of diverse data go unnoticed while the sensor is in detection mode. Furthermore, the worm would have to be extremely lucky for each of the content examples it sends to train the sensor to produce a "non-error" response from the intended victim. Indeed, P-DPL ignores content that does not produce a normal service response. These two evasion techniques, mimicry and training attacks, are part of our ongoing research on anomaly detection, and a formal treatment of the range of "counter-evasion" strategies we are developing is beyond the scope of this paper.

D. Worm Detection Evaluation

In this section, we provide experimental evidence of the effectiveness of P-DPL in detecting incoming worms. In our previous RAID paper [18], we showed P-DPL’s accuracy for the DARPA99 dataset, which contains a lot of artifacts that make the data too regular [16]. Here we report how P-DPL performs over the three real-world datasets using known worms available for our research. Since all three datasets were captured from real traffic, there is no ground truth, and measuring accuracy was not immediately possible. We thus needed to create test sets with ground truth, and we applied Snort for this purpose.

Each dataset was split into two distinct, chronologically ordered portions, one for training and the other for testing, following the 80%-20% rule. For each test dataset, we first created a clean set of packets free of any known worms still flowing on the Internet as background radiation. We then inserted the same set of worm traffic into the cleaned test set using tcpslice. Thus, we created ground truth in order to compute the accuracy and false positive rates. The worm set includes CodeRed, CodeRed II, WebDAV, and a worm that exploits the IIS Windows media service, the nsiislog.dll buffer overflow vulnerability (MS03-022). These worm samples were collected from real traffic as they appeared in the wild, from both our own dataset and from a third party. Because P-DPL only considers the packet payload, the worm set is inserted at random places in the test data. The ROC plots in Figure 6 show the detection rate versus false positive rate over varying threshold settings of the P-DPL sensor.

[Plot: ROC curves, annotated "All worms reliably detected"; x-axis: False Positive Rate (%)]
Fig. 6. ROC of P-DPL detecting incoming worms, false positive rate restricted to less than 0.5%

The detection rate and false positive rate are both based on the number of packets. The test set contains 40 worm packets although there are only 4 actual worms in our zoo. The plots show the results for each data set, where each graphed line is the detection rate of the sensor where all 4 worms were detected. (This means more than half of each worm’s packets were detected as anomalous content.) From the plot we can see that although the three sites are quite different in payload distribution, P-DPL can successfully detect all the worms at a very low false positive rate. To provide a concrete example, we measured the average false alerts per hour for these three sites. For a 0.1% false positive rate, the EX dataset has 5.8 alerts per hour, W1 has 6 alerts per hour and W has 8 alerts per hour.

We manually checked the packets that were deemed false positives. Indeed, most of these are actually quite anomalous, containing very odd payloads. For example, in the EX dataset, there are weird file uploads, in one case a whole packet containing nothing but a repetition of a character with byte value E7 as part of a Word file. Other packets included unusual HTTP GET requests, with the referrer field padded with many "Y" characters via a product providing anonymization.

We note that some worms might fragment their
content into a series of tiny packets to evade detection. For this problem, P-DPL buffers and concatenates very small packets of a session prior to testing.

We also tested the detection rate of the W32.Blaster worm (MS03-026) on TCP port 135 using real RPC traffic inside Columbia’s CS department. Despite this traffic being much more regular than HTTP traffic, the worm packets in each case were easily detected with zero false positives. Although at first blush 5-8 alerts per hour may seem too high, a key contribution of this paper is a method to correlate multiple alerts in order to extract true worm events from the stream of alerts.

IV. Worm Propagation Detection and Signature Generation by Correlation

In the previous section, we described the results of using P-DPL to detect anomalous packet content. We extended the detection strategy to model both inbound and outbound traffic from a protected host, computing models of content flows for ingress and egress packets. The strategy thus implies that within a protected LAN, some infected internal host will begin a propagation by sending outbound anomalous packets. When this occurs for any host in the LAN, we wish to inoculate all other hosts by generating and distributing worm packet signatures to other hosts for content filtering.

We leverage the fact that self-propagating worms will start attacking other machines automatically by replicating themselves, or at least the exploit portion of their content, shortly after a host is infected. (Polymorphic worms may randomly pad their content, but the exploit should remain intact.) Thus if we detect anomalous egress packets to port i that are very similar to the anomalous ingress traffic to port i, there is a high probability that a worm that exploits the service at port i has started its propagation. Note that these are the very first packets of the propagation, unlike other approaches which have to wait until the host has already shown substantial amounts of unusual scanning and probing behavior. Thus, the worm may be stopped at its very first propagation attempt from the first victim, even if the worm attempts to be slow and stealthy to avoid detection by probe detectors. We describe the ingress/egress correlation strategy in the following section. We note, however, that the same strategy can be applied to ingress packets flowing from arbitrary (external) sources to internal target IPs. Hence, ingress/ingress anomalous packet correlation may be viewed as a special case of this strategy.

Careful treatment of port-forwarding protocols and services, such as P2P and NTP (Port 123), is required to apply this correlation strategy; otherwise normal port forwarding may be misinterpreted as worm propagation. Our work in this area involves two strategies: truncation of packets (focusing on control data) and modeling of the content of media. This work is beyond the scope of this paper due to space limitations, and will be addressed in a future paper.

A. Ingress and Egress Traffic Correlation

When P-DPL detects some incoming anomalous traffic to port i, it generates an alert and places the packet content on a buffer list of "suspects". Any outbound traffic to port i that is deemed anomalous is compared to the buffer. The comparison is performed against the packet contents and a string similarity score is computed. If the score is higher than some threshold, we treat this as possible worm propagation and block or delay this outgoing traffic. This is different from the common quarantining or containment approaches, which block all the traffic to or from some machine. P-DPL will only block traffic whose content is deemed very suspicious, while all other traffic may proceed unabated, maintaining critical services.

There are many possible metrics that can be applied to decide the similarity of two strings. The approaches we have considered, tested and evaluated include:

1. String equality (SE)
This is the most intuitive approach. We decide that propagation has started only if the egress payload is exactly the same as the ingress suspect packet. This metric is very strict and good at reducing false positives, but too sensitive to any tiny change in the packet payload. If the worm changes a single byte or just changes its packet fragmentation, the anomalous packet correlation will miss the propagation attempt. (The same is true when comparing thumbprints of content.)

2. Longest common substring (LCS)
The next metric we considered is the LCS approach. LCS is less exact than SE, but avoids the fragmentation problem and other small payload manipulations. The longer the LCS that is computed between two packets, the greater the confidence that the suspect anomalous ingress/egress packets are similar. The main shortcoming of this approach is its computation overhead compared to string equality, although it can also be implemented in linear time.

3. Longest common subsequence (LCSeq)
This is similar to LCS, but the longest common subsequence need not be contiguous. LCSeq has the advantage of being able to detect polymorphic worms, but it may introduce more false positives.

For each pair of strings that are compared, we compute a similarity score; the higher the score, the more similar the strings are to each other. For SE, the score is 0 or 1, where 1 means equality. For both LCS and LCSeq, we use the percentage of the LCS or LCSeq length out of the total length of the candidate strings. Say string s1 has length L1, string s2 has length L2, and their LCS/LCSeq has length C. We compute the similarity score as 2*C/(L1+L2). This normalizes the score to the range [0...1], where 1 means the strings are exactly equal.

Since we may have to check each outgoing packet (to port i) against possibly many suspect strings inbound to port i, we need to concern ourselves with the computational costs and storage required for such a strategy. On a real server machine, e.g., a web server, there are large numbers of incoming requests but very few, if any, outgoing requests to port 80 from the server (to other servers). So any outgoing request is already quite suspicious, and we should compare each of them against the suspects. If the host machine is used as both a server and a client simultaneously, then both incoming and outgoing requests may occur frequently. This is mitigated somewhat by
the fact that we check only packets deemed anomalous, not every possible packet flowing to and from a machine. We apply the same modeling technique to the outgoing traffic and only compare the egress traffic we have already labeled as anomalous.

B. Automatic Worm Signature Generation

There is another very important benefit that accrues from the ingress/egress packet content correlation and string similarity comparison: automatic worm signature generation. The computation of the similarity score produces the matching substring or subsequence which represents the common part of the ingress and egress malicious traffic. This common subsequence serves as a signature content filter. Ideally, a worm signature should match worms and only worms. Since the traffic being compared is already judged as anomalous and has exhibited propagation behavior quite different from normal behavior, and since the similar malicious payload is being sent to the same service at other hosts, these common parts are very possibly core exploit strings and hence can represent the worm signature. By using LCSeq, we may capture even polymorphic worms, since the core exploit usually remains the same within each worm instance even though it may be reordered within the packet datagram. Thus, by correlating the ingress and egress malicious payloads, we are able to detect the very initial worm propagation and compute its signature immediately. Further, if we distribute these strings to collaborating sites, they too can leverage the added benefit of corroborating suspects they may have detected, and they may choose to employ content filters, preventing them from being exploited by a new, zero-day worm.

V. Evaluations

In this section, we evaluate the performance of ingress/egress correlation and the quality of the automatically generated signatures. Since none of the machines were attacked by worms during our data collection time at the three sites, we launched real worms against un-patched Windows 2000 machines in a controlled environment. For testing purposes, the packet traces of the worm propagation were merged into the three sites’ packet flows as if the worm infection had actually happened at each site. Since P-DPL only uses payload, the source and target IP addresses of the merged content are irrelevant. Without a complete collection of worms, and with limited capability to attack machines, we only tested CodeRed and CodeRed II out of the executable worms we collected. After launching these in our test environment and capturing the packet flow trace, we noticed interesting behavior: after infection, these two worms propagate with packets fragmented differently from the ones that initially infected the host. In particular, CodeRed can separate "GET." and "/default.ida?" and "NNN...N" into different packets to avoid detection by many signature-based IDSes. The following table shows the length sequences of different packet fragmentation for CodeRed and CodeRed II.

Different fragmentation for CR and CRII

Code Red
  Incoming:  1448, 1448, 1143
  Outgoing:  4, 13, 362, 91, 1460, 1460, 649
             4, 375, 1460, 1460, 740
             4, 13, 453, 1460, 1460, 649

Code Red II (total 3818 bytes)
  Incoming:  1448, 1448, 922
  Outgoing:  1460, 1460, 898

To evaluate the accuracy of worm propagation detection, we appended the propagation trace at the very end of one full day’s network data from each of the three sites. When we collected the trace from our attack network, we captured not only the incoming port 80 requests, but also all the outgoing traffic directed to port 80. We checked each dataset manually, and found a small number of outgoing packets for the servers that produced the datasets W and W1, as we expected, and not a single one for the EX dataset. Hence, any egress packets to port 80 would be obviously anomalous without having to inspect their content. For this experiment, we captured all suspect incoming anomalous payloads in an unlimited-size buffer for comparison across all of the available data in our test sets. We also purposely lowered P-DPL’s threshold setting (after calibration) in order to generate a very high number of suspects and so test the accuracy of the string comparison and packet correlation strategies. In other words, we increased the noise (increasing the number of false positives) in order to determine how well the correlation can still separate out the important signal in the traffic (the actual worm content).

The result of this experiment is displayed in the following table for the different similarity metrics. The number in parentheses is the threshold used for the similarity score. For an outgoing packet, P-DPL checks the suspect buffer and returns the highest similarity score. If the score is higher than the threshold, we judge that there is a worm propagation. A false alert means that an alert was mistakenly generated for a normal outgoing packet. The reason why SE does not work here is obvious: worm fragmentation blinds the method from seeing the worm’s entire matching content. The other two metrics worked perfectly, detecting all the worm propagations with zero false alerts.

Results of correlation for different metrics

              Detect propagate   False alerts
SE            Yes                No
LCS(0.5)      Yes                No
LCSeq(0.5)    Yes                No
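
The three similarity metrics and the suspect-buffer check described in this section can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function names (`lcs_substring_len`, `lcseq_len`, `similarity`, `is_propagation`) are hypothetical, and simple quadratic dynamic programs are used for brevity in place of the linear-time LCS the text alludes to. The score uses the normalization 2*C/(L1+L2) given above, and the default threshold of 0.5 is the one listed in the table.

```python
from typing import Iterable


def lcs_substring_len(a: bytes, b: bytes) -> int:
    """Length of the longest common contiguous substring (quadratic DP)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def lcseq_len(a: bytes, b: bytes) -> int:
    """Length of the longest common (not necessarily contiguous) subsequence."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(s1: bytes, s2: bytes, metric: str) -> float:
    """Similarity score in [0, 1]; metric is "SE", "LCS" or "LCSeq"."""
    if metric == "SE":
        return 1.0 if s1 == s2 else 0.0
    total = len(s1) + len(s2)
    if total == 0:
        return 0.0  # assumption: two empty payloads carry no evidence
    if metric == "LCS":
        common = lcs_substring_len(s1, s2)
    elif metric == "LCSeq":
        common = lcseq_len(s1, s2)
    else:
        raise ValueError(f"unknown metric: {metric}")
    return 2.0 * common / total  # 2*C/(L1+L2)


def is_propagation(egress: bytes, suspects: Iterable[bytes],
                   metric: str = "LCSeq", threshold: float = 0.5) -> bool:
    """Compare an anomalous egress payload with every buffered ingress suspect
    and flag propagation if the highest score exceeds the threshold."""
    best = max((similarity(egress, s, metric) for s in suspects), default=0.0)
    return best > threshold
```

With a suspect such as `b"GET /default.ida?" + b"N" * 20 + b"%u9090%u6858"` and an egress payload that is the same content minus its first four bytes (a different fragmentation), SE scores 0 while LCS and LCSeq both score above 0.9, which mirrors the fragmentation argument made in the text.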
To use some other traffic to simulate the outgoing traffic of the servers. For EX data, we used the outgoing port 80 traffic of other clients in that enterprise as if it originated from the EX server itself. For the W1 and W datasets, we used the outgoing port 80 traffic from the CS department. Then we repeated the previous experiments to detect the worm propagation with the injected outgoing traffic on each server. The result remains the same: using the same thresholds as before, we can successfully detect all the worm propagations without any false alerts.

As we mentioned earlier, the worm signature is a natural byproduct of the ingress/egress correlation. When we identify a possible worm propagation, the LCS or LCSeq can be used as the worm signature. Figure 7 displays the actual content signatures computed for the CR II propagations detected by P-DPL in a style suitable for deployment in Snort. Note that the signature contains some of the system calls used to infect a host, which is one of the reasons the false positive rate is so low for these detailed signatures.

|d0|$@|0 ff|5|d0|$@|0|h|d0| @|0|j|1|j|0|U|ff||d0| @| U|f
5|d8|$@|0 e8 19 0 0 0 c3 ff|%`0@|0 ff|%d0@|0 ff|%`
ff|%h0@|0 ff|%p0@|0 ff|%t0@|0 ff|%x0@|0 ff|%| ff|
0@|fc fc fc fc fc fc fc fc fc fc fc fc fc fc LORER.EXE
fc fc fc fc fc 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0|EXP
|0 0 0 0 0 0 0 0|SOFTWAREMicrosoftWindows NT
CurrentVersionWinlogon|0 0 0 0 0|SFCDisable|0 0
9 9 9d ffff ff ff|SYSTEMCurrentControlSetService
sW3SVCParametersVirtual Roots|0 0 0 00 0 0|/Scr
ipts|0 0 0 0|/MSADC| 0 00 0|/C|0 0 0|/D|0 00|c:,,21
7|0 0 0 0 00 0|d:,,217|fc fc f cfc f cfc fc fc fc fc fc fc

Fig. 7. The initial portion of the P-DPL generated signature for CodeRed II.

We replicated the above experiments in order to test whether any normal packet is blocked when we filter the real traffic against all the worm signatures generated. For our experiments we used the datasets from all three sites, which had the CRII attacks cleaned beforehand, and in all cases no normal packet was blocked.

In these experiments, we used an unlimited buffer for the incoming suspect payloads. The buffer essentially stores packets for some period of time that depends on the traffic rate and on the number of anomalous packet alerts generated from that traffic. That amount is indeterminate a priori, and is specific to both the environment being sniffed and the quality of the models computed by P-DPL for that environment. Since CR and CR II launch their propagations immediately after infecting their victim hosts, a buffer holding only the most recent 5 or 10 suspects is enough to detect their propagation. But for slow-propagating or stealthy worms, which might start propagating after an arbitrarily long hibernation period, the question is how many suspects we should save in the suspect buffer. If the ingress anomalous payloads have been removed from the suspect buffer before such a worm starts propagating, P-DPL can no longer detect it by correlation. Theoretically, the larger the buffer the better, but there is a tradeoff in memory usage and computation time. For those worms that may hibernate for a long period of time, cross-site collaboration and exchange of suspect packet payloads might provide a solution. We discuss this in the next section.

VI. Anomalous Payload Collaboration among Sites

Most current attack detection systems are constrained to a single ingress point within an enterprise, without sharing any information with other sites. There are ongoing efforts that share suspicious source IP addresses [5, 10], but to our knowledge no such effort exists to share content information across sites in real time until now. Here we focus on evaluating the detection accuracy of using collaboration among sites, assuming a scalable, privacy-preserving, secured communication infrastructure is available. (We have implemented a prototype in Worminator [17].)

Recall that, in Section 3.4, we described experiments measuring the diversity of the models computed at multiple sites. As we saw, the different sites tested have different normal payload models. This implies, from a statistical perspective, that they should also have different false positive alerts. Any "common or highly similar anomalous payloads" detected among two or more sites would logically be caused by a common worm exploit targeting many sites. Cross-site or cross-domain sharing may thus reduce the false positive problem at each site, and may more accurately identify worm outbreaks in the earliest stages of an infection.

To test this idea, we used the traffic from the three sites. There are two goals we seek to achieve in this experiment. One is to test whether different sites can help confirm with each other that a worm is spreading and attacking the Internet. The other is to test whether false alerts can be reduced, or even eliminated, at each site when content alerts are correlated. In this experiment, we used the following simple correlation rule: if two alerts from distinct sites are similar, the two alerts are considered true worm attacks; otherwise they are ignored. Each site’s content alerts act as confirmatory evidence of a new worm outbreak, even after two such initial alerts are generated. This is very strict, aiming for the optimal solution to the worm problem.

This is a key observation. The optimal result we seek is that any payload alerts generated from the same worm launched at two or more sites should be similar to each other, while normal data from either site that was a false positive should not. That is to say, if a site generates a false positive alert about normal traffic it has seen, it will not produce suspect payloads that any other site will deem to be worm propagation. Since we conjectured that each site’s content models are diverse and highly distinct, even the false positives each site may generate will not match the false positives of other sites; only worms (i.e., true positives) will be commonly matched as anomalous data among multiple sites.

To make the experiment more convincing, we no longer test the same worm traffic against each site as in the previous section, since the sensor will obviously generate the exact same payload alert at all the sites. Instead, we use multiple variants of CodeRed and CodeRed II, which were extracted from real traffic. To make the evaluation strict, we tested different packet payloads for the same worm, and all the variant packet fragments it generates. We purposely lowered the P-DPL threshold to generate many more false positives
from each site than it otherwise would produce. As in the case described above, the cross-site correlation uses the same metrics (SE, LCS and LCSeq) to judge whether two payload alerts are "similar". However, another problem that we need to consider when we exchange information between sites is privacy. It may be the case that a site is unwilling to allow packet content to be revealed to some external collaborating site. A false positive may reveal true content.

A packet payload could be represented by its 1-gram frequency distribution (see Figure 2). This representation already aggregates the actual content byte values in a form that makes it nearly impossible (but not totally impossible) to reconstruct the actual payload. (Since byte value distributions do not contain sequential information, the actual content is hard to recover. 2-gram distributions simplify the problem, making it more likely that the content can be recovered, since adjacent byte values are represented. 3-grams make it nearly trivial to recover the actual content in many cases.) However, we note that the 1-gram frequency distribution reordered into the rank-ordered frequency distribution produces a distribution that appears quite similar to the exponentially decreasing Zipf-like distribution. The rank ordering of the resultant distinct byte values is a string that we call the "Z-string" (as discussed in Section 3.1). One cannot recover the actual content from the Z-string. Rather, only an aggregated representation of the byte value frequencies is revealed, without the actual frequency information. This representation may convey sufficient information to correlate suspect payloads without revealing the actual payload itself. Hence, false positive content alerts would not reveal true content, and privacy policies would be maintained among sites.

In this cross-domain correlation experiment we propose two more metrics which do not require exchanging raw payloads, but instead only the 1-gram distributions and the privacy-preserving Z-string representation of the payload:

A. Manhattan distance (MD)
Manhattan distance requires exchange of the byte distribution of the packet, which has 256 float numbers. Two payloads are similar if they have a small Manhattan distance. The maximum possible MD is 2, so we define the similarity score as (2 - MD)/2, to normalize the score to the same range as the other metrics described above.

B. LCS of Z-string (Zstr)
While maintaining maximal privacy preservation, we perform the LCS on the Z-strings of two alerts. The similarity score is the same as the one for LCS, but here the score evaluates the similarity of two Z-strings, not the raw payload strings.

Figure 8 presents the results achieved by sharing P-DPL alerts among the three sites using CR and CR II and their variant packet fragments. The results are shown in terms of the similarity scores computed by each of the metrics. Each plot is composed of two different representations: one for false alerts (histogram) and the other for worm alerts (dots on the x-axis). The bars in the plots are histograms of the similarity scores computed for false P-DPL alerts. The x-axis shows the similarity score, defined within the range [0...1], and the y-axis is the number of pairs of alerts within the same score range. The similarity scores for the worm alerts are shown separately as dots on the x-axis. The worm alerts include those for CR and CR II and their variant fragments. Note that all of the scores calculated between worm alerts are much higher than those of the "false" P-DPL alerts, and thus they would be correctly detected as true worms among collaborating sites. The alerts that scored too low would not have sufficient corroboration to deem them true worms.

[Plot: Similarity Scores of Zstr Metric; legend: CRII against CRII, CR against CR, Other Alerts]
[Plot: Similarity Scores of LCSeq Metric; legend: CRII against CRII, CR against CR, CRII against CR, Other Alerts]
Fig. 8. Similarity scores of Zstr and LCSeq metrics for collaboration

The above two plots show the similarity scores using the Zstr and LCSeq metrics. LCS produced a similar result to LCSeq. The string equality and Manhattan distance metrics did not perform well in distinguishing true alerts from false ones, so their plots are not shown here. The other two metrics, presented in Figure 8, give particularly good results. The worms and their variant packet fragments have much higher similarity scores than all
the other alerts generated at each distinct site. This provides some evidence that this approach may work very well in practice and provide reliable information that a new zero-day attack is ongoing at different sites.

Note too that each site can contribute to false positive reduction, since the scores of the suspects are relatively low in comparison to the true worms.

Furthermore, the Zstr metric shows the best separation here, with the added advantage of preserving the privacy of the exchanged content. These two metrics can also be applied to the ingress/egress traffic correlation, especially for polymorphic worms that might re-order their content.

There are two interesting observations from this data. The circle in the LCSeq plot represents the similarity score when exchanging the alerts among the sites that P-DPL generated for CR and CR II. LCSeq is the only metric that gave a relatively high score worth noticing, while all the others provide less compelling scores. When we looked back at the tcpdump of CR and CR II, both of them contained the string "GET./default.ida?........u9090%u6858%ucbd3%u7801%u9090%u6858%ucbd3%", while CR has a string of repeated "N" and CR II has a string of repeated "X" padding their content. Since subsequences do not need to be adjacent in the LCSeq metric, LCSeq ignored the repetitions of the unmatched "N" and "X" substrings and successfully picked out the other common substrings. LCS also had a higher-than-average score here, but not as good as LCSeq. This example suggests that polymorphic worms attempting to mask themselves by changing their padding may be detectable by cross-site collaboration under the LCSeq metric.

Another observation is that the LCSeq and LCS results display several packet content alerts with high similarity scores. These were false alerts generated by the correlation among the sites. The scores were measured at about 0.4 to 0.5. Although they are still much smaller than the worm scores, they are already outliers, since they exceeded the score threshold used in this experiment. We inspected the content of these packets and discovered that they included long padded strings attempting to hide the HTTP headers. Some proxies try to hide the query identity by replacing some headers with meaningless characters, in our case a string of "Y"s. Such payloads were correlated as true alerts when using LCSeq/LCS as metrics, although they are not worms. However, these anomalies did not appear when we used the Zstr metric, since the long string of "Y"s used in padding the HTTP header only influences one position in the Z-string, and has no impact on the remainder of the Z-string.

These results suggest that cross-site collaboration can greatly help identify the early appearance of new zero-day worms while reducing the false positive rates of the constituent P-DPL anomaly detectors. The similarity scores between worms and their variants are much higher than those between "true" false positives (normal data incorrectly deemed anomalies), and can be readily separated with high accuracy.

When several sites on the Internet detect similar anomalous payloads directed at them, they can confirm and validate with each other with high confidence that an attack is underway. As we mentioned earlier, this strategy can also solve the limited buffer size problem described in Section 4.3. If we only consider one single host, a stealthy worm can hibernate for a long period of time until a record of its appearance as an anomaly is no longer stored in the buffer of suspect packets. However, in the context of collaborating sites, the suspect anomaly can be corroborated by some other site that may also have a record of it in its buffer, as a remote site may have a larger buffer or may have received the worm at a different time. The distributed sites essentially serve as a remote long-term store of information, extending the local buffer memory available at one site. Further, this strategy concurrently generates content filtering signatures. Any two sites that correlate and validate suspects as being true worms both have available the actual packet content from which to generate a signature, even if only Z-strings are exchanged between those sites.

VII. Evaluation Results

As the first step in evaluating P-DPL, we compared the two function extraction methods (i.e., SM and IDA) and the common-function-filtering capability of the CFLs that were generated using the two methods. Fig. 8 shows the percentage of candidate and common function corpora among all functions extracted from the malware set by the SM and IDA-Pro extraction methods for each CFL size (500, 1000, 2000, 4000, and 8000 files). It is evident from the diagram that SM is capable of extracting more functions from the malware files: IDA extracted 57 929 functions, while SM extracted 249 158. Function size was limited to a minimum of 16 B and a maximum of 256 B. The figure also shows that SM is capable of trimming a larger portion of functions which appear in the CFL, and that the portion of remaining functions becomes smaller as the size of the CFL increases.

The first observation, regarding SM’s extraction superiority, is consistent with IDA-Pro detecting fewer functions from the training set. This is probably due to the fact that IDA-Pro is more rigid software and cannot deal effectively with code obfuscation, which is a prominent technique employed by hackers [30]. The SM method, by contrast, works by nature at high recall (extracting as many functions as possible) and low precision (many of the extracted functions might not really be functions); thus it extracts more functions. The second observation, the filtering capability of the two methods, can be explained straightforwardly by the fact that as the size of the CFL grows, the likelihood increases that a function extracted from the malware set will appear in the CFL.

For some malware files, it might be the case that all of the extracted functions were identified as common functions and were therefore filtered out by the CFL. In such cases, the method cannot generate a signature for the malware. Fig. 9 depicts the percentage of malware that was left without candidates. The figure shows that IDA missed more malware. The reason is that 1) it extracts fewer functions and 2) IDA only detects functions that are being called from other functions using standard protocols. This may not be the case in malware that wishes to camouflage its existence. As expected, for both methods, increasing the CFL also increases the missed malware, but the gap between IDA and SM also narrows. Figs 10–13 depict the detection rate of candidate signatures and signatures selected for the malware set in the control file set. This rate serves as a measure of the false-positive detection rate
of malware in benign files. We checked the false-positive rate of signature candidates in the control set files. Fig. 10 depicts the percentage of signature candidates detected in the control set as a function of the candidate's length in bytes and the CFL size. We expected that the false-positive rate would drop for longer signature candidates and for larger CFLs. The length of a signature candidate affects the probability of finding the same byte sequence in an arbitrary file. Indeed, regardless of the function and signature extraction technique (SM/IDA), short candidates caused most of the hits. Consequently, based on the diagram, we recommend using function candidates only if their length is above 112 B in order to ensure a lower false-positive rate. An exception is shown with SM with CFL size 500 MB and SM with CFL size 1000 MB, where we see a high false-positive rate for candidates that are 160–176 B. Using a CFL of 2000 MB or more eliminates the problematic candidates. Additionally, we can see that using a larger CFL contributes considerably to lowering the false-positive rate in both function extraction methods.

Fig. 9. Percentage/number of common functions versus candidate functions extracted from the malware set for IDA and SM for several CFL dataset sizes.

In Fig. 11, we compare the false-positive rate of the two function extraction methods, SM and IDA-Pro, for different CFL sizes. With both extraction methods, the false-positive rate is reduced when a large CFL is used. Compared to IDA, the SM method achieves a lower false-positive rate when using a CFL with a size greater than 1000 files. Next, we compare the mean false-positive rate (averaged over both SM and IDA function extraction methods) when using candidates with/without a 16-B offset (see Fig. 12) and when choosing a signature randomly or by using the entropy score (see Fig. 13).
As expected, adding a 16-B offset to the candidate functions and using the entropy score to choose a signature help in reducing the percentage of signatures detected in the control set files. This was consistent for all tested CFL sizes. The entropy selection method favors large signatures; approximately 80% of the signatures that the entropy-based method selected were larger than 112 B. Selecting a candidate randomly shows that only 50% are 112 B or more. This observation complies with the results presented in Fig. 9 regarding the recommended size of candidates. Since the entropy method evidently showed better results than Rand, we continued to investigate the signature-generating methods using only the entropy method.

The goal of the next and final experiment was to show how the IDA and SM methods are affected by using an offset along with the signature candidate for different CFL sizes. It also provides the most significant improvement when the CFL size is increased. This is shown by the detection rate declining from 2.7% to 0% for a CFL greater than 2000 MB. Note that the detection rate in this context relates to false positives, i.e., the undesirable detection of a malware signature in benign files; thus, a lower detection rate is better.

Fig. 10. Malware without signature candidates: the percentage of malware that were left without candidates (i.e., all extracted functions were filtered by the functions in the CFL), and thus, the method cannot extract a signature.

Fig. 11. False-positive rate of candidates in the control set files as a function of the candidate size in bytes.

The worst signature-generation method is IDA when not using an offset of 16 B. However, this method, when combined with a CFL containing 8000
files, still manages to have a low false-positive rate (FPR) of 0.4%.

Fig. 12. Comparing the false-positive rate for the two function extraction methods, SM and IDA-Pro (IDA), as a function of the CFL size.

Fig. 13. Comparing the mean false-positive rate (averaged over SM and IDA function extraction methods) with/without adding offset bytes to function candidates.

Fig. 14. Comparing the mean false-positive rate (averaged over SM and IDA function extraction methods) when using random (Rand) signature selection and entropy-based selection. The entropy-based heuristic performs better than the random selection method for all CFL sizes.

Finally, we tested the signatures generated by P-DPL for false negatives using a DefensePro intrusion detection appliance. False negatives in the context of P-DPL mean that a signature generated for a malware file was not identified in an instance of the same malware (e.g., as a result of a long signature split over multiple packets). Therefore, false negatives depend on the detection engine. The malware detection capability of DefensePro is based on IP packet inspection and does not reconstruct the files. We uploaded the signatures extracted by P-DPL to the DefensePro signature database and configured the device to reset any session for which a packet was identified with a malware signature. We transmitted all malware (for which P-DPL successfully generated a signature) via the DefensePro (at the maximal speed at which we could load the link) and executed several tests in which DefensePro successfully removed all malware.
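The false-negative case mentioned above, a long signature split over multiple packets, can be illustrated with a minimal Python sketch. It is not the DefensePro matching engine; the helper names, the 32-byte signature, and the packet sizes are all hypothetical and chosen only to show why a per-packet matcher that does not reconstruct files can miss a signature that a reassembling matcher would find.

```python
from typing import List


def split_into_packets(payload: bytes, size: int) -> List[bytes]:
    """Cut a payload into consecutive chunks of at most `size` bytes (hypothetical packetizer)."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def per_packet_match(packets: List[bytes], signature: bytes) -> bool:
    """Inspect each packet in isolation, without reconstructing the file."""
    return any(signature in packet for packet in packets)


def reassembled_match(packets: List[bytes], signature: bytes) -> bool:
    """Concatenate the packets first, then search the whole stream."""
    return signature in b"".join(packets)


# Hypothetical 32-byte signature placed at offset 90 of a 212-byte payload.
signature = b"\xde\xad\xbe\xef" * 8
payload = b"A" * 90 + signature + b"B" * 90

# With 100-byte packets the signature straddles the boundary at offset 100.
straddling = split_into_packets(payload, 100)
# With 64-byte packets the signature (offsets 90..121) fits inside the second packet.
contained = split_into_packets(payload, 64)

print(per_packet_match(straddling, signature))   # signature is cut in two
print(reassembled_match(straddling, signature))  # found after reassembly
print(per_packet_match(contained, signature))    # found inside one packet
```

Under these assumptions, whether a per-packet engine sees the signature depends only on where packet boundaries fall relative to it, which is why the text notes that false negatives depend on the detection engine.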
Additionally, we measured the time required by P-DPL to generate a signature as a function of the file size, for file sizes ranging from 100 to 900. We observed a linear increase in signature extraction time as a function of the file size.

VIII. Conclusion

In this paper we propose a new automatic mechanism, termed P-DPL, for extracting signatures from malware files. Signatures generated by P-DPL are comprised of multiple byte-strings, which can be used by high-speed, network-based malware filtering devices. To minimize the risk of false positives, P-DPL employs a method for filtering out functions that originate from the underlying standard development platforms rather than from the malicious programs developed on these platforms.
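The two ideas summarized above, discarding candidates that appear in the common function repository and then choosing among the remaining candidates by an entropy score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the P-DPL implementation: the paper does not define its entropy score, so Shannon entropy over byte frequencies is used as a stand-in, the function names are hypothetical, and the CFL is modeled as a plain set of byte strings. The 16 B and 256 B size limits are the ones reported in the evaluation.

```python
import math
from collections import Counter
from typing import Iterable, Optional, Set


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (assumed stand-in for the paper's entropy score)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_signature(candidates: Iterable[bytes],
                     common_functions: Set[bytes],
                     min_len: int = 16,
                     max_len: int = 256) -> Optional[bytes]:
    """Drop candidates found in the CFL, then return the highest-entropy survivor.

    Returns None when every candidate was filtered out, i.e. the case in which
    no signature can be generated for the malware file.
    """
    remaining = [c for c in candidates
                 if min_len <= len(c) <= max_len and c not in common_functions]
    if not remaining:
        return None
    return max(remaining, key=shannon_entropy)


# Hypothetical data: one platform function that also appears in the CFL,
# one low-entropy padding block, and one high-entropy candidate.
platform_function = b"\x55\x8b\xec" * 8   # 24 bytes, present in the CFL
padding_block = b"\x90" * 32              # entropy 0
varied_block = bytes(range(64))           # 64 distinct byte values

cfl = {platform_function}
chosen = select_signature([platform_function, padding_block, varied_block], cfl)
print(chosen == varied_block)
```

A real repository would hold far more entries than a Python set comfortably models, and an exact-match lookup is only one possible way to test membership; both are simplifications made for the sketch.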
We tested our method in a network-security laboratory under various configurations of the two function extraction methods, IDA-Pro and SM. Both IDA-Pro and SM are fast techniques for extracting functions from assembly files. However, support for new compilers in SM is added manually, which makes it highly prone to errors. In order to overcome this limitation, we developed P-DPL, and our results support the viability of the general approach proposed by this research, which suggests that common code identified as functions in the program can be discarded. Realizing P-DPL for generating signatures for high-speed network appliances requires a methodology for building common function repositories. Given the global variety of development platforms facilitated by the Internet, the external validity of this study relies substantially on reaching a critical mass of malware files. P-DPL is also helpful for identifying allergy attacks, which can be mounted against any system in which signatures are created automatically. This type of attack is mainly relevant and realistic for methods that use machine-learning algorithms, where the learning-based algorithm can make the automated signature-generation method treat the data it is given as malicious.

In order to cope with fully obfuscated malware in packets on high-speed deep packet inspection devices, we believe that P-DPL should be implemented for filtering most of the malware by using high-speed malware filtering devices. We plan to work on detecting and extracting the binary code, as well as on selecting the best signature out of the collection of candidates using probability and variance methods. In addition, regular expressions defined by two or more distinct signatures can be used in order to minimize the risk of malware.

References

[1] S. B. Cho, “Incorporating soft computing techniques into a probabilistic intrusion detection system,” IEEE Trans. Syst., Man, Cybern., Part C, vol. 32, no. 2, pp. 154–160, May 2002.

[2] ...ssel, and P. Laskov, “Learning and classification of malware behavior,” in Proc. Conf. Detect. Intrusions Malware Vulnerability Assessment, Springer Press, 2008, pp. 108–125.

[3] M. Bailey, J. Oberheide, J. Andersen, Z. M. Mao, F. Jahanian, and J. Nazario, “Automated classification and analysis of internet malware,” in Proc. 12th Int. Symp. Recent Adv. Intrusion Detect., Springer Press, 2007, pp. 178–197.

[4] K. Griffin, S. Schneider, X. Hu, and T. Chiueh, “Automatic generation of string signatures for malware detection,” in Proc. 12th Int. Symp. Recent Adv. Intrusion Detect., Springer Press, 2009, pp. 101–120.

[5] G. Jacob, H. Debar, and E. Filiol, “Behavioral detection of malware: From a survey towards an established taxonomy,” J. Comput. Virol., vol. 4, pp. 251–266, 2008.

[6] D. Gryaznov, “Scanners of the year 2000: Heuristics,” in Proc. 5th Int. Virus Bull., 1999, pp. 225–234.

[7] J. Zhang, M. Zulkernine, and A. Haque, “Random-forests-based network intrusion detection systems,” IEEE Trans. Syst., Man, Cybern., Part C, vol. 38, no. 5, pp. 649–659, Sep. 2008.

[8] A. Shabtai, D. Potashnik, Y. Fledel, R. Moskovitch, and Y. Elovici, “Monitoring, analysis and filtering system for purifying network traffic of known and unknown malicious content,” Secur. Commun. Netw. [Online]. DOI: 10.1002/sec.229.

[9] Y. Tang, B. Xiao, and X. Lu, “Using a bioinformatics approach to generate accurate exploit-based signatures for polymorphic worms,” Comput. Secur., vol. 28, pp. 827–842, 2009.

[10] G. Jacob, H. Debar, and E. Filiol, “Behavioral detection of malware: From a survey towards an established taxonomy,” J. Comput. Virol., vol. 4, no. 3, 2008.

[11] S. Singh, C. Estan, G. Varghese, and S. Savage, “Automated worm fingerprinting,” in OSDI '04: Proc. 6th Conf. Symp. Operating Systems Design & Implementation, Berkeley, CA, USA, USENIX Association, 2004, pp. 4–4.

[12] J. Newsome and D. Song, “Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software,” in Proc. Network and Distributed System Security Symp. (NDSS 2005), 2005.

[13] M. Costa, J. Crowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang, and P. Barham, “Vigilante: End-to-end containment of internet worms,” in SOSP '05: Proc. 20th ACM Symp. Operating Systems Principles, New York, NY, USA, ACM, 2005, pp. 133–147.

[14] J. Xu, P. Ning, C. Kil, Y. Zhai, and C. Bookholt, “Automatic diagnosis and response to memory corruption vulnerabilities,” in CCS '05: Proc. 12th ACM Conf. Computer and Communications Security, New York, NY, USA, ACM, 2005, pp. 223–234.

[15] M. Damashek, “Gauging similarity with n-grams: Language independent categorization of text,” Science, vol. 267, no. 5199, pp. 843–848, 1995.

[16] R. Lippmann et al., “The 1999 DARPA off-line intrusion detection evaluation,” Computer Networks, vol. 34, no. 4, pp. 579–595, 2000.

[17] M. Locasto, J. Parekh, S. Stolfo, A. Keromytis, T. Malkin, and V. Misra, “Collaborative distributed intrusion detection,” Columbia University Tech Report CUCS-012-04, 2004.

[18] K. Wang and S. Stolfo, “Anomalous payload-based network intrusion detection,” in Proc. Recent Advances in Intrusion Detection (RAID), Sept. 2004.
AUTHOR DETAILS

1 N. Kannaiya Raja received the MCA degree from Alagappa University and the ME degree in Computer Science and Engineering from Anna University Chennai in 2007. He has been pursuing a PhD degree at Manonmaniam Sundranar University since 2008. He has worked as an assistant professor in various engineering colleges in Tamil Nadu affiliated to Anna University and has eight years of teaching experience. His research work is in deep packet inspection. He has been a session chair at major conferences and workshops on computer vision, algorithms, networks, mobile communication, image processing, and pattern recognition. His current primary areas of research are packet inspection and networks. He is interested in giving guest lectures at various engineering colleges in Tamil Nadu.

2 Dr. K. Arulanandam received the Ph.D. degree in 2010 from Vinayaka Missions University. He has twelve years of teaching experience in various engineering colleges in Tamil Nadu affiliated to Anna University, and his research experience covers networks, mobile communication networks, image processing, and algorithms. He is currently working at Ganadipathy Tulasi's Jain Engineering College, Vellore.

3 M. Balaji received the B.Tech degree in Information Technology from Anna University Chennai in 2008 and is now pursuing the ME degree in Computer Science and Engineering at Arulmigu Meenakshi Amman College of Engineering, affiliated to Anna University Chennai.