DDoS mitigation through a collaborative trust-based request prioritization

Master Thesis
Design and Countermeasures against Botnet-Based DDoS Attacks
(original title: "Disegno e contromisure per attacchi DDoS effettuati da Botnet")

DDoS Mitigation through a Collaborative Trust-Based Request Prioritization

Faculty of Mathematical, Physical and Natural Sciences
Master Degree in Computer Science

Candidate: Davide Paltrinieri (ID Number 1225725)
Advisor: Prof. Massimo Bernaschi
Co-advisors: Prof. Neeraj Suri, Ing. Mirco Marchetti

Academic Year 2009/2010
Acknowledgments

The financial assistance of the EU project CoMiFin (Communication Middleware for Monitoring Financial Critical Infrastructures, IST-225407) towards this research is hereby acknowledged. The opinions expressed and conclusions reached are those of the author and are not necessarily attributable to CoMiFin.

Preface

I would like to express my sincere appreciation to a number of people and institutions for their support, useful comments and suggestions. The following are included:

I thank Professor Massimo Bernaschi for his passion and skill in teaching the course on Operating Systems II. I would like to thank him in particular for the incredible freedom and availability he granted me throughout the course of this study.

I am heartily thankful to Mirco Marchetti for the generosity with which he heard me out from the beginning; without his confidence, my thesis proposal would never have taken real shape. I thank him for his competence, because every discussion with him brought great results, and for his time and willingness to receive me in spite of holidays and the closure of the department.

It is an honour for me to thank Professor Neeraj Suri for his trust. He allowed me to complete this study in absolute freedom and discretion. I would like to thank him especially for the spirit that he has managed to pass on to his DEEDS research group.

From the DEEDS group I would like to thank Majid Khelil and Hamza Ghani, for leaving me free to explore and deepen a research field of my interest, and for helping me formulate the model on the basis of all the ideas I had in my mind. Reading my mind was certainly no trivial undertaking.
I am indebted to Stefan, who welcomed me more like a friend than a colleague. I thank him for all the chats we had during our runs in Herrengarten, for all the kinds of beer we tried (which I hope will never end...), and for making me understand that there are no weather conditions unsuitable for good home-made ice cream. I am very thankful to him and Maria for having opened their home's doors to me as one usually does for an old friend.

I would like to thank Thorsten because, even if he did not follow me to the final step, without his encouragement I could never have finished the Darmstadt Marathon. I thank him for being my staunch companion from the first to the last day in the office, and especially for being a friend inside and outside the office walls.

I thank Ute for her great readiness to help and her greatness of heart. I thank her for all the cakes, and for offering me her home as the place for my farewell party.

I would like to thank Daniel, because without his assistance I could never have eaten at the Best Worscht in Town up to the FBI level.

My great thanks to Marco for his sympathy, and for guiding me during my first steps in the DEEDS group.

I am grateful to all the other DEEDS members, for all those cakes and beers that they shared with me from my first days in Germany.

This thesis would not have been possible without the help of Jelena Mirkovic, the wealth of her studies, her passion and her selfless inspiration. I am very thankful to her because she taught me the importance of sharing research results; without her the DETERlab would probably not exist. I thank her and all the DETERlab system administrators I worked with personally. They helped me a lot in solving the problems that surfaced during the implementation phase.

My special thanks to Corrado Leita for helping me understand in more detail the operation of WOMBAT, to the point of sharing with me the results of one of his articles before its publication.
I thank him for giving me the credentials to access the HARMUR and SGNET datasets through WAPI, and for letting me experience the power of WOMBAT.

I thank the SMT2 and OWA developers for helping me debug the errors due to some esoteric implementations in my project.

I would like to show my gratitude to Stefano Zanero and all the other members of the Italian Chapter of IISFA for sharing with me their opinions and points of view, in particular on privacy issues. I would like to thank all the friends of IISFA because they keep transmitting a style of professional growth that never loses sight of its human and relational aspects.
I thank Ekaterina, Ameed and Herve for deep chats and sharing in the kitchen, and for the accommodation in the student residence; in short, for being the best possible flatmates ever.

My heartfelt thanks to all those who have accompanied me in these years and helped me achieve my goal.

Finally, I offer my regards and blessings to all those who supported me in any respect during the completion of the project.

Davide
Contents

1 Introduction
2 DDoS and Existing Countermeasures
  2.1 Survey of DDoS Attacks
    2.1.1 Attack organization
    2.1.2 Taxonomy of DDoS Attacks
    2.1.3 Bandwidth Depletion Attacks
    2.1.4 Resource Depletion Attacks
  2.2 Survey of DDoS Countermeasures
    2.2.1 Pushback (2002)
    2.2.2 WebSOS (2004)
    2.2.3 DefCOM (2003-2006)
    2.2.4 Speak-Up: DDoS Defense by Offense (2006)
    2.2.5 OverDoSe: A Generic DDoS Protection Service Using an Overlay Network (2006)
    2.2.6 Portcullis (2006)
    2.2.7 Stop-It (2008)
    2.2.8 TVA: Traffic Validation Architecture (2009)
    2.2.9 DDoS-Shield (2009)
  2.3 Relevant Collaborative Infrastructure Systems
    2.3.1 CoMiFin
    2.3.2 WOMBAT
    2.3.3 DShield
    2.3.4 FIRE: FInding RoguE Networks
3 Models Overview
  3.1 Attacker and Victim Model
    3.1.1 Victim Model
    3.1.2 Attacker Model
    3.1.3 Defense Model
  3.2 Defense Strategies
    3.2.1 Detection
    3.2.2 Classification
    3.2.3 Response
4 Software Architecture
  4.1 Logical Description
    4.1.1 Detection
    4.1.2 Classification
    4.1.3 Response
  4.2 Model Components Description
    4.2.1 Border Router
    4.2.2 SP: Smart Proxy
    4.2.3 SC: Suspicion Checking
    4.2.4 ADL: Actions Deep Logger
    4.2.5 Monitoring
    4.2.6 TCDB: Trust Collaborative Database
    4.2.7 Auditing Room
    4.2.8 Target Server
5 Software Implementation
  5.1 Required Software
    5.1.1 SeleniumHQ
    5.1.2 TC: Traffic Control
    5.1.3 SMT2: Simple Mouse Tracking
    5.1.4 OWA: Open Web Analytics
    5.1.5 Dosmetric
    5.1.6 Other used software
  5.2 Implementation Architecture Overview
  5.3 Implementation Components Description
    5.3.1 Internet
    5.3.2 Border Router
    5.3.3 SmartProxy
    5.3.4 WS
    5.3.5 ADL
    5.3.6 Static Trust Database
    5.3.7 Monitoring
    5.3.8 Audit Room
  5.4 Testbed Description
    5.4.1 DETERlab: cyber-DEfense Technology Experimental Research laboratory Testbed
    5.4.2 Hardware Cluster Details
    5.4.3 SEER
    5.4.4 Custom Scripts
    5.4.5 Local test porting to DETERlab
6 Test Evaluation
  6.1 Test Description
    6.1.1 Single Run Experiment
    6.1.2 Parameters Tuning
  6.2 Test Results
    6.2.1 Legitimate Client only
    6.2.2 Small botnet
      Refining priority after first attack
    6.2.3 Medium botnet
      Refining priority after first attack
    6.2.4 Large botnet
      Refining priority after first attack
    6.2.5 Auditing Process
  6.3 Test Summary
7 Conclusion and Future Work
  7.1 Conclusion
  7.2 Future Work
A Custom Scripts
  A.1 Bots Control
    A.1.1 batchlib.cfg
    A.1.2
    A.1.3
    A.1.4
    A.1.5 Legitimate client session
  A.2 DETERlab
    A.2.1 test-4.ns
    A.2.2 Nodes cluster deployment
  A.3 SmartProxy
    A.3.1 tc.c (before DDoS)
    A.3.2 TC rules
    A.3.3 tc.c (post DDoS)
Bibliography
Chapter 1

Introduction

Scalability and openness are among the main principles that inspired the design of the Internet, as well as two critical factors in its success. However, a significant drawback of the open design of the Internet is the lack of inherent security guarantees. On the Internet, anyone can send any packet to anyone without being authenticated, including unsolicited packets that deliver useless or even malicious payloads. Moreover, the lack of authentication means that attackers can hide their real identity, making it difficult for victims and for law enforcement agencies to pinpoint the real source of an attack. All systems connected to the Internet are potential targets for attacks, since the openness of the Internet makes them accessible to malicious traffic.

A Denial of Service (DoS) attack aims to prevent legitimate clients from accessing a service by making it unavailable. When the traffic of a DoS attack comes from multiple sources, possibly geographically distributed across the Internet, we call it a Distributed Denial of Service (DDoS) attack. By using multiple attack sources it is possible to amplify the amount of attack traffic, thereby increasing the effectiveness of a DDoS attack and making it extremely difficult to defend against.

The first DDoS attacks date back to 1988 with the release of the Morris worm [52]. Since then DDoS attacks have drastically increased in terms of both frequency and impact. Many DDoS attacks are motivated by political reasons, and recently several instances of DDoS attacks have targeted well-known web sites and financial institutions. Their effectiveness is also testified by the emphasis with which DDoS attacks are often described in the news.
In 2010, several Bollywood companies hired Aiplex Software to launch DDoS attacks on websites that did not respond to software take-down notices. Piracy activists then created Operation Payback in September 2010 in retaliation. The original plan was to attack Aiplex Software directly, but upon finding, some hours before the planned DDoS, that another individual had taken down the firm's website, Operation Payback moved to launching attacks against the websites of the copyright-stringent organizations Motion Picture Association of America (MPAA) and International Federation of the Phonographic Industry, giving the two websites a combined total downtime of 30 hours [50].

Later, in December 2010, WikiLeaks came under intense pressure to stop publishing secret United States diplomatic cables. Corporations such as Amazon, PayPal, Bank of America, PostFinance, MasterCard and Visa either stopped working with or froze donations to WikiLeaks, some due to political pressure. In response, those behind Operation Payback directed their activities against these companies for dropping support to WikiLeaks. Operation Payback launched DDoS attacks against PayPal, the Swiss bank PostFinance and the Swedish Prosecution Authority. In December a coordinated DDoS attack by Operation Payback brought down both the MasterCard and Visa websites as well [55].

Figure 1.1: Operation Payback results

There are two main kinds of DDoS attacks. The first kind leverages software vulnerabilities of a target by sending malformed packets that cause the attacked service to crash. The second kind, by contrast, uses massive volumes of useless traffic to occupy all the resources of the attacked server, thus preventing it from serving requests coming from legitimate clients. While it is relatively simple to protect a server against the first form of attack (by patching known vulnerabilities), the second form is not so easily prevented. Any server instantly becomes a potential target as soon as its connection to the public Internet makes it reachable by any other connected host.

The attacked resources are typically the bandwidth available to the target web server, or server-farm/data-center resources such as hard disks, CPUs and databases. In particular, it is possible to carry out effective DDoS attacks that only require relatively small traffic volumes by issuing requests that access slow resources (such as databases) or that cause the server to perform resource-intensive computations. Recent research [53] shows that attackers are able to find such bottlenecks and to exploit them.

For example, if attackers find a bottleneck in the database, they will try to stuff the database and put it in a locked state. They will also make attack requests as similar as possible to requests coming from legitimate clients, thus making it impossible to defend against the attack through trivial request-filtering techniques. The urgent need for effective solutions against such attacks is therefore evident.

Even though many books deal with the subject and commercial solutions are available, these types of attacks continue to prove effective and cause extensive damage.

The aim of this thesis is to design and evaluate new tools to defend against application-layer DDoS attacks [53].

A comprehensive survey of DDoS attack and mitigation strategies already presented in the scientific literature is provided in Chapter 2.
In the same chapter we also describe other relevant research projects that focus on collaborative defense approaches, even though they do not focus specifically on DDoS attacks.

In Chapter 3 we define the model of the attacks that we try to mitigate. The aim of these attacks is the exhaustion of hardware resources of the attacked server(s), not the saturation of their bandwidth. Their strength lies in the fact that it is very difficult to distinguish them from the traffic generated by legitimate clients. We also define a new attack mitigation model, taking inspiration from previous solutions.

In Chapter 4 we present the logical description of the proposed solution, and we provide details on the inner workings of each component.
The implementation of the proposed solution is described in Chapter 5, while Chapter 6 describes the experiments carried out to demonstrate the effectiveness of the proposed solution.
Chapter 2

Distributed Denial of Service Attacks and Existing Countermeasures

This chapter first presents a brief survey of the best-known and most popular DDoS attacks. Section 2.2 then describes the state of the art in mitigation solutions found in the literature, with a brief analysis of their pros and cons. Finally, the last section covers some collaborative infrastructure projects that inspired and contributed to the development of the solution we propose in this study.

2.1 Survey of DDoS Attacks

Below we first discuss the methods most commonly used by attackers to build armies of compromised hosts (botnets) and to wage attacks from them; afterwards we describe in detail the most popular DDoS attacks.

2.1.1 Attack organization

Agent-Handler model

An Agent-Handler DDoS attack network consists of clients, handlers, and agents, as shown in Figure 2.1.
Figure 2.1: DDoS Agent-Handler attack model

The client platform is where the attacker communicates with the rest of the DDoS attack network. The handlers are software packages, located on computing systems throughout the Internet, that the attacker uses to communicate indirectly with the agents. The agent software resides on compromised systems that will eventually carry out the attack on the victim system. The attacker communicates with any number of handlers to identify which agents are up and running, when to schedule attacks, or when to upgrade agents. Depending on how the attacker configures the DDoS attack network, agents can be instructed to communicate with a single handler or with multiple handlers. Usually, attackers will try to place the handler software on a compromised router or network server that handles large volumes of traffic. This makes it harder to identify messages between the client and handler and between the handler and agents. The communication between attacker and handler, and between the handler and agents, can take place via the TCP, UDP, or ICMP protocols.

The owners and users of the agent systems typically have no knowledge that their system has been compromised and will be taking part in a DDoS attack. When participating in a DDoS attack, each agent program uses only a small amount of resources (both memory and bandwidth), so that the users of these computers experience minimal change in performance. In descriptions of DDoS tools, the terms handler and agent are sometimes replaced with master and daemon respectively. Also, the systems that have been compromised to run the agent software are referred to as the secondary victims, while the target of the DDoS attack is called the (primary) victim.
IRC-Based DDoS Attack Model

Internet Relay Chat (IRC) is a multi-user, on-line chatting system. It allows computer users to create two-party or multi-party interconnections and type messages in real time to each other [13]. IRC network architectures consist of IRC servers located throughout the Internet, with channels used to communicate across the Internet. IRC chat networks allow their users to create public, private and secret channels. Public channels are channels where multiple users can chat and share messages and files; they allow users of the channel to see all the IRC names and messages of users in the channel. Private and secret channels are set up by users to communicate only with other designated users. Both private and secret channels hide the names and messages of logged-on users from users who do not have access to the channel. Although the content of private channels is hidden, certain channel-locator commands will allow users not on the channel to identify its existence, whereas secret channels are much harder to locate unless the user is a member of the channel.

An IRC-based DDoS attack network is similar to the Agent-Handler DDoS attack model, except that instead of using a handler program installed on a network server, an IRC communication channel is used to connect the client to the agents. By making use of an IRC channel, attackers using this type of DDoS attack architecture gain additional benefits. For example, attackers can use legitimate IRC ports for sending commands to the agents [13]. This makes tracking the DDoS command packets much more difficult. Additionally, IRC servers tend to have large volumes of traffic, making it easier for the attacker to hide his presence from a network administrator. A third advantage is that the attacker no longer needs to maintain a list of agents, since he can simply log on to the IRC server and see a list of all available agents [13].
The agent software installed in the IRC network usually communicates with the IRC channel and notifies the attacker when the agent is up and running. A fourth advantage is that IRC networks also provide the benefit of easy file sharing. File sharing is one of the passive methods of agent code distribution that we discuss in Section 4. This makes it easier for attackers to recruit secondary victims to participate in their attacks. In an IRC-based DDoS attack architecture, the agents are often referred to as "zombie bots" or "bots". In both IRC-based and Agent-Handler DDoS attack models, we will refer to the agents as secondary victims or zombies.
2.1.2 Taxonomy of DDoS Attacks

There is a wide variety of DDoS attack techniques. We propose a taxonomy of the main DDoS attack methods in Figure 2.2. There are two main classes of DDoS attacks: bandwidth depletion and resource depletion attacks [12]. A bandwidth depletion attack is designed to flood the victim network with unwanted traffic that prevents legitimate traffic from reaching the (primary) victim system. A resource depletion attack is designed to tie up the resources of a victim system; this type of attack targets a server or process on the victim system, making it unable to process legitimate requests for service.

Figure 2.2: DDoS taxonomy
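The two-level taxonomy just described can also be captured as a small data structure. The sketch below is our own illustrative encoding of the prose in Sections 2.1.3 and 2.1.4, not material from the thesis; the later "Application Layer Attacks" discussion extends resource depletion beyond what the figure from [12] covers, so it is left out here.

```python
# Our own encoding of the taxonomy of Figure 2.2 (two classes, their
# sub-classes, and the concrete techniques described in the text).
DDOS_TAXONOMY = {
    "Bandwidth depletion": {
        "Flood attacks": ["UDP flood", "ICMP flood"],
        "Amplification attacks": ["Smurf", "Fraggle"],
    },
    "Resource depletion": {
        "Protocol exploit attacks": ["TCP SYN", "PUSH + ACK"],
        "Malformed packet attacks": ["IP address attack",
                                     "IP packet options attack"],
    },
}

def leaves(tree):
    """Flatten the taxonomy into its concrete attack techniques."""
    if isinstance(tree, list):
        return list(tree)
    out = []
    for subtree in tree.values():
        out.extend(leaves(subtree))
    return out

print(len(leaves(DDOS_TAXONOMY)))   # 8 concrete techniques
```

Keeping the taxonomy machine-readable like this makes it easy to, for example, label detection rules or test cases by attack class.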
2.1.3 Bandwidth Depletion Attacks

There are two main classes of DDoS bandwidth depletion attacks. A flood attack involves the zombies sending large volumes of traffic to a victim system in order to congest the victim system's bandwidth. An amplification attack involves either the attacker or the zombies sending messages to a broadcast IP address, causing all systems in the subnet reached by the broadcast address to send a message to the victim system. This method amplifies malicious traffic, reducing the victim system's bandwidth.

Flood Attacks

In a DDoS flood attack the zombies flood the victim system with IP traffic. The large volume of packets sent by the zombies to the victim system slows it down, crashes the system, or saturates the network bandwidth. This prevents legitimate users from accessing the victim.

UDP Flood Attacks. User Datagram Protocol (UDP) is a connectionless protocol. When data packets are sent via UDP, no handshake is required between sender and receiver, and the receiving system will simply receive packets it must process. A large number of UDP packets sent to a victim system can saturate the network, depleting the bandwidth available for legitimate service requests to the victim system. In a DDoS UDP flood attack, the UDP packets are sent to either random or specified ports on the victim system. Typically, UDP flood attacks are designed to hit random victim ports. This causes the victim system to process the incoming data to try to determine which applications have requested data. If the victim system is not running any application on the targeted port, it will send an ICMP "destination port unreachable" packet to the sending system. Often, the attacking DDoS tool will also spoof the source IP address of the attacking packets. This helps hide the identity of the secondary victims, and it ensures that return packets from the victim system are not sent back to the zombies but to another computer with the spoofed address. UDP flood attacks may also fill the bandwidth of connections located around the victim system (depending on the network architecture and line speed). This can sometimes cause systems connected to a network near a victim system to experience connectivity problems.
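The "destination port unreachable" behaviour described above can be observed from a few lines of Python. This is our own illustrative probe, not thesis code; the host, port and function names are arbitrary, and the exact outcome is OS-dependent. On Linux, a connected UDP socket surfaces the queued ICMP error as a ConnectionRefusedError on the next socket call, which is exactly the per-packet work a flood forces the victim's kernel to perform.

```python
import socket

def probe_udp_port(host="127.0.0.1", port=50007):
    """Send one UDP datagram and report how the host reacted (Linux behaviour)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(1.0)
    try:
        s.connect((host, port))   # "connected" UDP: kernel delivers ICMP errors
        s.send(b"probe")          # the host must check the port to react at all
        s.recv(1024)              # a queued ICMP error is raised here
        return "answered"
    except ConnectionRefusedError:
        return "icmp-port-unreachable"   # no listener: ICMP error came back
    except socket.timeout:
        return "no-reply"                # e.g. packets silently filtered
    finally:
        s.close()

result = probe_udp_port()
print(result)
```

A flood simply repeats this single exchange millions of times from many zombies, so that generating the lookups and ICMP replies, not any one packet, exhausts the victim.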
ICMP Flood Attacks. Internet Control Message Protocol (ICMP) packets are designed for network management features such as locating network equipment and determining the number of hops or the round-trip time from a source location to a destination. For instance, ICMP_ECHO_REPLY packets ("ping") allow a user to send a request to a destination system and receive a response with the round-trip time. A DDoS ICMP flood attack occurs when the zombies send large volumes of ICMP_ECHO_REPLY packets to the victim system. These packets signal the victim system to reply, and the combined traffic saturates the bandwidth of the victim's network connection. As with the UDP flood attack, the source IP address may be spoofed.

Amplification Attacks

A DDoS amplification attack exploits the broadcast IP address feature found on most routers to amplify and reflect the attack (see Figure 2.3).

Figure 2.3: Amplification attacks

This feature allows a sending system to specify a broadcast IP address as the destination address rather than a specific address, instructing the routers servicing the packets within the network to send them to all the IP addresses within the broadcast address range. For this type of DDoS attack, the attacker can send the broadcast message directly, or can use the agents to send the broadcast message in order to increase the volume of attacking traffic. If the attacker sends the broadcast message directly, this attack provides the ability to use the systems within the broadcast network as zombies without needing to infiltrate them or install any agent software. We further distinguish two types of amplification attacks: Smurf and Fraggle attacks.

Smurf Attacks. In a DDoS Smurf attack, the attacker sends packets to a network amplifier (a system supporting broadcast addressing), with the return address spoofed to the victim's IP address. The attacking packets are typically ICMP ECHO REQUESTs, which are packets (similar to a "ping") that ask the receiver to generate an ICMP ECHO REPLY packet. The amplifier sends the ICMP ECHO REQUEST packets to all of the systems within the broadcast address range, and each of these systems returns an ICMP ECHO REPLY to the target victim's IP address. This type of attack amplifies the original packet tens or hundreds of times.

Fraggle Attacks. A DDoS Fraggle attack is similar to a Smurf attack in that the attacker sends packets to a network amplifier. Fraggle differs from Smurf in that Fraggle uses UDP ECHO packets instead of ICMP ECHO packets [56]. There is a variation of the Fraggle attack in which the UDP ECHO packets are sent to the port that supports character generation (chargen, port 19 on Unix systems), with the return address spoofed to the victim's echo service (echo, port 7 on Unix systems), creating an infinite loop [56]. The UDP Fraggle packet targets the character generator in the systems reached by the broadcast address. These systems each generate a character to send to the echo service of the victim system, which resends an echo packet back to the character generator, and the process repeats. This attack generates even more bad traffic and can create even more damaging effects than a Smurf attack.

2.1.4 Resource Depletion Attacks

DDoS resource depletion attacks involve the attacker sending packets that misuse network protocol communications, or sending malformed packets that tie up network resources so that none are left for legitimate users.
Protocol Exploit Attacks

TCP SYN Attacks. In a TCP SYN attack the attacker sends a succession of SYN requests to a target system. When a client attempts to start a TCP connection to a server, the client and server exchange a series of messages which normally runs like this:

1. The client requests a connection by sending a SYN (synchronize) message to the server.

2. The server acknowledges this request by sending a SYN-ACK back to the client.

3. The client responds with an ACK, and the connection is established.

This is called the TCP three-way handshake, and it is the foundation of every connection established using the TCP protocol. The TCP SYN attack works if a server allocates resources after receiving a SYN, but before it has received the ACK.

In a DDoS TCP SYN attack, the attacker instructs the zombies to send bogus TCP SYN requests to a victim server in order to tie up the server's processor resources, and hence prevent the server from responding to legitimate requests. The attack exploits the three-way handshake by sending large volumes of TCP SYN packets to the victim system with spoofed source IP addresses, so the victim system responds to a non-requesting system. Eventually, if the volume of TCP SYN attack requests is large and they continue over time, the victim system will run out of resources and be unable to respond to any legitimate user.

PUSH + ACK Attacks. In the TCP protocol, packets that are sent to a destination are buffered within the TCP stack, and when the stack is full, the packets are passed on to the receiving system. However, the sender can request the receiving system to unload the contents of the buffer before it becomes full by sending a packet with the PUSH bit set to one. PUSH is a one-bit flag within the TCP header [13].
TCP stores incoming data in large blocks for passage on to the receiving system in order to minimize the processing overhead required by the receiving system each time it must unload a non-empty buffer. The PUSH + ACK attack is similar to a TCP SYN attack in that its goal is to deplete the resources of the victim system. The attacking agents send TCP packets with the PUSH and ACK bits set to one. These packets instruct the victim system to unload all data in the
TCP buffer (regardless of whether or not the buffer is full) and send an acknowledgement when complete. If this process is repeated with multiple agents, the receiving system cannot process the large volume of incoming packets and it will crash.

Malformed Packet Attacks

A malformed packet attack is an attack where the attacker instructs the zombies to send incorrectly formed IP packets to the victim system in order to crash it. There are two types of malformed packet attacks. In an IP address attack, the packet contains the same source and destination IP addresses. This can confuse the operating system of the victim system and cause it to crash. In an IP packet options attack, a malformed packet may randomize the optional fields within an IP packet and set all quality-of-service bits to one, so that the victim system must use additional processing time to analyze the traffic. If this attack is multiplied using enough agents, it can shut down the processing ability of the victim system.

Application Layer Attacks

In a just-released report on botnet-generated DDoS attacks, Prolexic noted that attackers are quickly tweaking their botnets to make attack traffic look increasingly similar to legitimate, routine traffic. Instead of the huge burst of traffic that marks when an attack begins, traffic will begin to ramp up slowly as bots join the attack at random intervals, with each bot varying its attack style, making it increasingly difficult to separate real users from bots, the report said [64].

There are several tools that automatically launch layer-7 DDoS attacks. A particularly interesting one is Apache L7DA (Layer 7 DDoS Attack). The tool works by exhausting Apache processes: it sends incomplete request headers, so Apache keeps waiting for the final header line to arrive, while the tool just sends a bogus header line to keep the connection open. Besides Apache (both versions 1.x and 2.x), Squid is also affected.
Given how many servers run on Apache, this makes the tool very dangerous, since it requires absolutely no knowledge from the attacker: all he or she has to do is run the tool, and the target site goes down.
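The exhaustion mechanism described above can be sketched as a simulation of a process-per-connection server: each connection that trickles incomplete header lines just inside the read timeout pins one worker indefinitely, so a modest number of slow connections starves legitimate requests. The worker count and timeout are illustrative assumptions, not Apache defaults.

```python
# Sketch of slow-header ("incomplete request") exhaustion of a
# process-per-connection server. Parameters are illustrative assumptions.

WORKERS = 150          # maximum simultaneous connections the server handles
READ_TIMEOUT = 10.0    # seconds the server waits for the next header line

class Server:
    def __init__(self):
        self.workers = {}   # conn_id -> time of the last header byte received

    def _reap(self, now):
        # Only connections idle past the read timeout release their worker.
        self.workers = {c: t for c, t in self.workers.items()
                        if now - t < READ_TIMEOUT}

    def on_header_fragment(self, conn_id, now):
        """A client sent another (still incomplete) header line."""
        self._reap(now)
        if conn_id in self.workers or len(self.workers) < WORKERS:
            self.workers[conn_id] = now   # worker stays parked on this socket
            return True
        return False                      # no worker free: connection refused

srv = Server()
# Each bot re-sends a bogus header line every 9 s (< READ_TIMEOUT), so its
# worker is never reaped and the request never completes.
for tick in range(0, 60, 9):
    for bot in range(WORKERS):
        srv.on_header_fragment(f"bot-{bot}", now=float(tick))

# A legitimate request arriving during the attack finds no worker free.
print(srv.on_header_fragment("legit-client", now=60.0))  # False
```

The key design point the sketch exposes is that a per-read timeout alone is not enough: the attacker stays just under it, so defenses need a bound on total header time or on concurrent slow connections.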
2.2 Survey of DDoS Countermeasures

Among the various proposals, we can see mainly two schools of thought: the capability-based approach [5, 21, 8, 6] and the filter-based approach [2, 3, 4]. Both advertise to enable a receiver to control the traffic it receives, but differ significantly in methodology. The capability approach proposes to let a receiver explicitly authorize the traffic it desires to receive, while the filter approach proposes to let a receiver install dynamic network filters that block the traffic it does not desire to receive. Advocates of filters have argued that capabilities are neither sufficient nor necessary to combat DoS [9], while proponents of capabilities strongly disagree [6]. In this section we are going to explore the most relevant solutions developed in recent years.

2.2.1 Pushback (2002)

Pushback is a cooperative mechanism that can be used to control an aggregate upstream. The core idea behind Pushback is to configure routers to rate-limit the attack traffic as close as possible to the source. As shown in figure 2.4, the congested router asks its adjacent upstream routers to rate-limit the aggregate of traffic that seems to be malicious.

Figure 2.4: Pushback takes rate-limiting closer to the source(s).

This request is sent only to the neighbors that send most of the traffic matching the detected aggregate; the neighbors can in turn propagate Pushback recursively to their upstream routers. The most relevant benefits introduced with this solution are:
• It is very powerful when the source address cannot be trusted; in that case the congested router cannot narrow the congestion signature by itself.

• Very good when the attackers are located on a path separate from the legitimate traffic.

The main weaknesses of Pushback are:

• Pushback does not completely block attack traffic [2]. Instead, it aims to rate-limit the attack traffic to its fair share of bandwidth.

• It needs a high degree of collaboration across routers, which is hard to enforce in a realistic scenario such as the Internet.

• It is hard to find the right threshold for the period of time after which the filter on the aggregate is not needed anymore.

• When the attack is uniformly distributed, it cannot detect the aggregate of the attack.

• If the attack is sent from hosts on the same network as the victim, Pushback has no effect on it.

• It cannot work in a non-contiguous deployment.

• If the source IP addresses of the incoming requests cannot be trusted, the upstream aggregate can also be difficult to determine correctly. In this case only the target could be trusted, and the resulting filters can only be pushed halfway through the network between the target and the sources of the attack.

In conclusion, the filtering techniques of Pushback are really effective only if the filters can be placed close to the source of the attack.

2.2.2 Web-SOS (2004)

Secure Overlay Services (SOS) is a network overlay mechanism designed to counter the threats posed by Distributed Denial of Service attacks (DDoS). WebSOS is an adaptation of SOS for the Web environment that guarantees access to a web server that is targeted by a DDoS attack. The approach of WebSOS exploits two key characteristics of the web environment: its design around a human-centric interface, and the extensibility inherent in many browsers through downloadable applets. The goal of this solution is
to guarantee access to a web server for a large number of previously unknown users, without requiring preexisting trust relationships between users and the system.

WebSOS acts as a distributed firewall, eliminating communication pinch-points and thus preventing the Target Server's routers from being congested. Clients need to connect securely to an Access Point (Figure 2.5), and the overlay network routes them to the actual Target Server. The Target Server allows only the secret servlet (or a set of secret servlets) to connect through the filtered area. The filtering is done using fields that a router can filter fast (e.g. the IP address of the secret servlet). The secret servlet's location can be varied through time.

Figure 2.5: WebSOS filtering techniques

Below is a consideration that comes from the related published paper [5]: the disruption in the actual service depends on the number of secure overlay access points and on the resources and distribution of the actual attacker's zombies. The addition of Graphic Turing Tests allows the system to accept non-authenticated traffic, which is something that most web services require. Additionally, Graphic Turing Tests (GTTs) separate humans from automated attack scripts and provide more protection against naive automated attacks. Finally, GTTs provide the necessary time for the overlay to heal from the automated attacks: they prevent such traffic from penetrating the overlay network and being routed to the Target Server, thus making the actual Web service more resilient to DDoS attacks.
The main positive points of this solution are summarized in:

• Cooperation with other ISPs/ASes is not necessary;

• Autonomous decisions for detecting and enforcing the defense mechanism;

• No changes to protocols, web servers or browsers needed;

• Proactive approach to fighting Denial of Service (DoS) attacks;

• The overlay can self-heal when a participant node is attacked;

• Scalable access control.

While the weaknesses are:

• Considerable overhead introduced: tests show that the latency increases by a factor of 7 in some particular implementations [5];

• GTTs can be guessed [49, 48];

• No legacy support: it is necessary to install an applet on each client, so not all services and usage scenarios are supported;

• It assumes, for the security analysis, that no attack can come from inside the overlay;

• It assumes that an attacker cannot mask illegitimate traffic to appear legitimate;

• To improve scalability, the number of SOAPs, Beacons, and Secret Servlets is limited, which lessens the protection from DoS attacks.

2.2.3 DefCOM (2003-2006)

DefCOM [18] provides added functionality to existing defenses so they can collaborate in DDoS detection and response. The nodes in the DefCOM overlay are classified into three categories, based on the functionality they provide during the attack:

1. Alert generator. The functionality is deployed on a native node (router) close to the victim, which may have any sophisticated attack detection mechanism. It receives an attack detection signal from its native node and sends the alert message to all other DefCOM nodes through the
overlay. The alert contains the IP address of the attack's victim and specifies a desired rate limit, e.g., the size of the victim's bottleneck link.

2. Classifier. The functionality is deployed on a native node close to the source, which can perform traffic differentiation. A classifier receives packets from its native node, stamped with the legitimate or suspicious mark it negotiated with this node. It re-stamps the packets with stamps it negotiated with its peers, which ensures priority handling for stamped traffic downstream.

3. Rate limiter. The functionality is deployed in a native node (router), which runs a weighted fair-share algorithm for prioritization between the legitimate and suspicious traffic classes, and drops unstamped traffic. The rate limiter reclassifies the traffic it receives based on the incoming traffic stamps and the amount of traffic received from each peer. Traffic is reclassified as legitimate, suspicious or unstamped and is given to the weighted fair-share algorithm.

Each native node can deploy one or more functionalities, depending on its resources and its authorization within the overlay, but the placement of a node facilitates some functionalities better than others.

Below we itemize the main benefits of this solution:

• Hybrid solution that can be improved and upgraded by design (add-on friendly);

• Can address a wide range of DDoS threats (depending on the kind of defense mechanisms included);

• Wide deployment (source-end/core-network/victim-end);

• Classifiers and rate limiters are active only during an attack;

• The stamp on a packet (high/low priority) is periodically negotiated with the other peers;

• Security: a global CA is required for joining the DefCOM peer network, and messages carry the signature of their owner.

On the other hand, DefCOM suffers from these weaknesses:
• Filtering of traffic is not fine-grained (only HIGH/LOW priority);

• The active testing to detect whether HIGH-stamped traffic is really legitimate is simply based on a congestion-responsiveness check; because of that, it is not effective in some scenarios.

• Legitimate traffic from legacy networks competes with attack traffic for bandwidth.

2.2.4 Speak-Up: DDoS Defense by Offense (2006)

Speak-up [21] is a defense against application-level distributed denial-of-service (DDoS), in which attackers cripple a server by sending legitimate-looking requests that consume computational resources (e.g., CPU cycles, disk). With speak-up, a victimized server encourages all clients, resources permitting, to automatically send higher volumes of traffic. Speak-up is based on the assumption that attackers are already using most of their upload bandwidth, so they cannot react to the encouragement. Good clients, however, have spare upload bandwidth and will react to the encouragement with drastically higher volumes of traffic. The intended outcome of this traffic inflation is that the good clients crowd out the bad ones, thereby capturing a much larger fraction of the server's resources than before.

The experiments described in the paper [21] demonstrate that speak-up causes the server to spend resources on a group of clients in rough proportion to their aggregate upload bandwidth. While the authors of this original solution declare that the test results make the defense viable and effective for a class of real attacks, we definitely disagree. Indeed, a scenario in which attacks come from a botnet of compromised web servers, as in [62], could be even more dangerous with Speak-up active than with it disabled: the upload bandwidth available to web servers is obviously far larger than that of common clients.

2.2.5 OverDoSe: A Generic DDoS Protection Service Using an Overlay Network (2006)

OverDoSe uses a computational puzzle scheme to provide fairness in the request channel.
When a client wishes to connect to a server, it first sends a request to a name server to resolve the IP address of the server (step 1 in Figure 2.6).
Figure 2.6: OverDoSe basic protocol

The name server returns a list of IP addresses of overlay nodes (step 2 in Figure 2.6). The client selects an overlay node to which it sends a connection request (step 3 in Figure 2.6). In response to a client's connection request, the overlay node replies with the latest puzzle seed released by the server, as well as a puzzle difficulty level specified by the server (step 4 in Figure 2.6). The client is expected to solve a puzzle at or above the specified difficulty level in order to successfully set up a connection. The client generates a puzzle based on the puzzle seed, solves the puzzle, and sends the puzzle solution to the overlay node (step 5 in Figure 2.6). At this point the overlay node validates the puzzle solution and forwards the request and the solution to the server (step 6 in Figure 2.6). The server assigns a cookie to the requesting client, and replies to the overlay node with the cookie and a flow specification (step 7 in Figure 2.6). The flow specification is a set of rules the overlay node must enforce for regulating an established flow; it is updated dynamically by the server. The overlay node then replies to the client with the cookie, successfully completing the connection setup. The client attaches the cookie to all subsequent packets to the server. The overlay node then routes traffic between the client and the server, and polices the client's flow according to the flow specification.
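Steps 4 and 5 of the protocol above (puzzle issuance and solving) can be sketched with a generic hashcash-style construction; the actual puzzle scheme used by OverDoSe may differ, but the essential asymmetry is the same: solving costs roughly 2^difficulty hash operations, while verifying costs a single hash.

```python
# Generic hashcash-style puzzle, sketching steps 4-5 of the protocol above.
# This is an illustrative construction, not necessarily OverDoSe's own.

import hashlib

def leading_zero_bits(digest: bytes) -> int:
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits

def solve(seed: bytes, difficulty: int) -> int:
    """Client side: brute-force a nonce (expected ~2^difficulty attempts)."""
    nonce = 0
    while True:
        digest = hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1

def verify(seed: bytes, nonce: int, difficulty: int) -> bool:
    """Overlay-node side: one hash, regardless of difficulty."""
    digest = hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty

seed = b"server-released-seed"       # step 4: seed + difficulty from the server
nonce = solve(seed, difficulty=12)   # step 5: the client burns CPU to solve
print(verify(seed, nonce, 12))       # True: cheap check before forwarding
```

Raising the difficulty level lets the server scale the per-request cost imposed on all clients during an attack, which is exactly the knob the flow specification exposes in the protocol.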
2.2.6 Portcullis (2006)

Portcullis aims to provide a defense against large-scale DDoS attacks: even when under attack, a legitimate sender can successfully initiate a connection with the receiver and communicate with low packet loss. Portcullis is based on a new puzzle-based protection for capability request packets. Hence, the main goal of the solution is to design a Denial-of-Capability-resistant request channel for a capability system. This design is based on computational puzzles, which the authors prove can provide optimal fairness for the request channel. As a result, Portcullis strictly bounds the delay a collection of attackers can impose on legitimate clients. To achieve per-computation fairness, Portcullis leverages a novel puzzle-based mechanism, which enables all routers to easily verify puzzle solutions, and uses the existing DNS infrastructure to disseminate trustworthy and verifiable fresh puzzle challenges. By enforcing per-computation fairness in the request channel, Portcullis severely limits the attackers' flooding rate.

The Achilles' heel of Portcullis, as well as of other capability proposals, is that it relies on puzzles; we will describe the weaknesses of this kind of solution in more detail in paragraph 3.2.2.

2.2.7 Stop-It (2008)

StopIt is designed as a filter-based DoS defense architecture. StopIt employs a novel closed-control, open-service architecture to combat various strategic attacks against the defense system itself, and to enable any receiver to block the undesired traffic it receives [2]. StopIt is resistant to strategic filter-exhaustion attacks and to bandwidth-flooding attacks that aim to prevent the timely installation of filters.

Figure 2.7: StopIt architecture
The architecture of StopIt is shown in figure 2.7, where a dashed circle represents an AS boundary. The figure also depicts the steps to install a StopIt filter.

First, when a destination Hd detects attack traffic from a source Hs, it invokes the StopIt service (router Rd) to block the attack flow (Hs, Hd) for a desired period of time. In the second step, the access router verifies this request to confirm that the source Hs is attacking the destination Hd, and sends a router-server request to the AS's StopIt server Sd. At point 3 of figure 2.7, the StopIt server Sd in destination Hd's AS forwards an inter-domain StopIt request to the StopIt server Ss in source Hs's AS to block the flow (Hs, Hd) for the chosen period of time. In step 4 of figure 2.7, the source StopIt server Ss locates the access router Rs of the attacking source Hs, and sends a server-router request to that access router. A StopIt server ignores inter-domain StopIt requests that would block itself, to prevent deadlocks. In the last step (5 of figure 2.7), the access router Rs verifies the StopIt request, installs a filter, and sends a router-host StopIt request to the attacking source Hs. After receiving this request, a compliant host Hs installs a local filter to stop sending to Hd.

The StopIt mitigation system focuses basically on two families of DDoS attacks: destination flooding attacks and link flooding attacks. In destination flooding attacks, the attackers send traffic floods to a destination in order to disrupt the destination's communications. Link flooding attacks instead aim to congest a link and disrupt the legitimate communications that share that link.
The main strengths of the StopIt service are:

• Possibility of wide incremental deployment;

• It includes a mechanism to prevent IP spoofing [20] (through BGP announcements and ingress filtering);

• Highly scalable.

On the other hand, StopIt suffers from the following weaknesses:

• If the attack does not reach the victim, but congests a link shared by the victim, the filters are not installed. In this scenario a capability-based system outperforms StopIt.

• Sensitive to uniformly distributed attacks: in such a scenario the filters cannot be installed, by design. In a realistic scenario in which a botnet is uniformly distributed, bots could be classified as legitimate.
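The five-step filter-installation flow described above can be sketched as a small message-passing simulation. Verification and AS lookup are stubbed out, and the class and variable names simply follow figure 2.7; only the request chain and the expiring filter table are modeled.

```python
# Sketch of the five-step StopIt filter installation described above.
# Verification and AS lookup are stubbed; names follow figure 2.7.

class AccessRouter:
    def __init__(self):
        self.filters = {}   # (src, dst) -> expiry time

    def install(self, src, dst, duration, now):
        # Step 5: Rs verifies the request (stubbed) and installs the filter.
        self.filters[(src, dst)] = now + duration

    def blocks(self, src, dst, now):
        expiry = self.filters.get((src, dst))
        return expiry is not None and now < expiry  # filters expire on their own

class StopItServer:
    def __init__(self, access_routers):
        self.access_routers = access_routers   # host -> its access router

    def inter_domain_request(self, src, dst, duration, now):
        # Step 4: Ss locates the attacker's access router Rs and forwards
        # a server-router request to it.
        self.access_routers[src].install(src, dst, duration, now)

def victim_requests_block(src, dst, source_as_server, duration, now):
    # Steps 1-3: Hd asks its router Rd, Rd verifies and asks its server Sd
    # (both stubbed here), and Sd forwards the inter-domain request to Ss.
    source_as_server.inter_domain_request(src, dst, duration, now)

rs = AccessRouter()
ss = StopItServer({"Hs": rs})
victim_requests_block(src="Hs", dst="Hd", source_as_server=ss,
                      duration=600, now=1000.0)
print(rs.blocks("Hs", "Hd", now=1100.0))  # True: flow (Hs, Hd) is filtered
print(rs.blocks("Hs", "Hd", now=1700.0))  # False: the filter has expired
```

Keying filters on the (source, destination) pair with an expiry is what keeps the design "open-service": any receiver can request a block, but state is bounded in time.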
2.2.8 TVA: Traffic Validation Architecture (2009)

Traffic Validation Architecture (TVA) [8] is a network architecture that limits the impact of Denial of Service (DoS) floods from the outset. TVA is based on the notion of capabilities, like SOS and Speak-up [5, 21]. In TVA, instead of sending packets to any destination at any time, senders must first obtain permission to send from the receiver, which provides the permission in the form of capabilities to those senders whose traffic it agrees to accept. The senders then include these capabilities in packets. This enables verification points distributed around the network to check that traffic has been authorized by the receiver and the path in between, and hence to cleanly discard unauthorized traffic. TVA addresses a wide range of possible attacks against communication between pairs of hosts, including spoofed packet floods, network and host bottlenecks, and router state exhaustion.

The main benefits of TVA are:

• Scalability: incrementally deployable in the current Internet.

• No changes to the Internet or to legacy routers are needed.

• Uses a path identifier to mitigate the problem of IP spoofing.

• Fine-grained capabilities.

• Legitimate users are isolated from the attack traffic through hierarchically fair queues.

The main weaknesses of TVA instead are:

• The path identifier mechanism is also vulnerable to tag forging.

• Vulnerable to floods of authorized traffic: it simply shares the bandwidth among all senders using a fair-queuing approach based on the destination IP. If the number of attackers is larger than the number of legitimate users, the service is still liable to DDoS.

• Clients with low request rates cannot be well protected, due to the router queues, as in every capability-based system [19].

2.2.9 DDoS-Shield (2009)

DDoS-Shield [22] is a counter-DDoS mechanism to protect the application from layer-7 DDoS attacks.
These attacks mimic legitimate clients and overwhelm the system resources, thereby substantially delaying or denying service
to the legitimate clients. Its main goal is to provide adequate service to legitimate clients even during the attack. The defense model of DDoS-Shield consists of a mitigation system integrated into a reverse proxy that schedules or drops attack requests before they reach the web-cluster tier. DDoS-Shield examines the requests belonging to every session, parses them to obtain the request type, and maintains the workload and arrival history of the requests in the session.

Figure 2.8: DDoS-Shield Model

Figure 2.8 shows the system architecture of DDoS-Shield, which consists of:

1. A suspicion assignment mechanism that uses session history to assign a suspicion measure to every client session;

2. A DDoS-resilient scheduler that decides which sessions are allowed to forward requests, and when, depending on the scheduling policy and the scheduler service rate.

The main idea behind DDoS-Shield is really interesting and it inspired a large part of our proposal. Nevertheless, after analysing this solution we raise the following issues:

Legitimate-profile poisoning: the model does not incorporate the attack where the attacker sends a large number of completely normal sessions (same session-arrival rate, same workload profile and same request-arrival rate). If we consider an always- or frequently-connected botnet that lightly injects the same session pattern, or a set of similar ones, the learning system starts to give a low level of suspicion
to this kind of traffic. It means that when the attacker decides that it is time to start the attack, the traffic generated by the botnet might be assigned a low suspicion level. Such requests take a high priority in the scheduler's dispatching policy. In this scenario the impact of the attack can be even worse than without using that mitigation technique.

Wide-spread legitimate sessions: all bots submit the same or a similar sequence of legitimate sessions, which can be indistinguishable from human-generated ones. The power and broadness of a large botnet can confuse the detection system used in DDoS-Shield and thus bypass its defenses.

2.3 Relevant Collaborative Infrastructure Systems

Hereby we present the main collaborative infrastructures that have been taken into account in our study, even if none of them is part of a project specifically dedicated to DDoS mitigation. We take these solutions into consideration because we deem it fundamental, in order to combat cyber crime, to share reports regarding the attacks recorded on one's infrastructure. From received attacks it is possible to extract a lot of information that may be useful to understand the dynamics behind them. By sharing this information between trusted partners, it is possible to form a basis that permits them to strengthen their defenses and to be better prepared against future attacks.

The first project presented here is CoMiFin, whose purpose is exactly the one described above, restricted to the world of Financial Institutions (FI). Then comes a paragraph about WOMBAT, another project funded by the European Community, which similarly aims to provide tools to analyze and understand the existing and emerging threats targeting the Internet economy and the net citizens. The third paragraph is about the public project DShield, which, like WOMBAT, consists of a network of sensors that collect information about anomalies in the Internet into a central database.
Finally, a brief description of FIRE, a service to identify rogue networks and Internet Service Providers, closes the chapter.
2.3.1 CoMiFin

Communication Middleware for Monitoring Financial Critical Infrastructure is an EU project funded by the Seventh Framework Programme (FP7), started in September 2008 and continuing for 30 months. The research area is Critical Infrastructure Protection (CIP), focusing on the Critical Financial Infrastructure (CFI).

CoMiFin does not perturb or require any changes to existing client infrastructures or proprietary networks. It is an add-on middleware layer structured as a stack of overlay networks built on top of the Internet in order to exploit Internet business continuity. The system facilitates the sharing of critical events among interested parties, which can in turn use these events to trigger the necessary local protection mechanisms in a timely fashion. A scheme of the communication structure in CoMiFin is shown in figure 2.9.

Figure 2.9: CoMiFin Framework

Subsets of participants are commonly grouped in federations. Federations are regulated by contracts and they are enabled through the Semantic Room abstraction: this abstraction facilitates the secure sharing and processing of information by providing a trusted environment for the participants to
contribute and analyze data. Input data can be real-time security events, historical attack data, logs, and other sources of information that concern other Semantic Room participants. Semantic Rooms can be deployed on top of an IP network, allowing adaptable configurations from peer-to-peer to cloud-centric ones, according to the needs and requirements of the Semantic Room participants. A key objective of CoMiFin is to prove the advantages of having a cooperative approach in the rapid detection of threats. Specifically, CoMiFin demonstrates the effectiveness of its approach by addressing the problem of protecting financial critical infrastructures. This allows groups of financial actors to take advantage of the Semantic Room abstraction for exchanging and processing information. Some of this shared information may include:

• Technical: Fault Notifications

• Service Related: Interruptions, Updates

• Infrastructure: Power, Network Faults

• Security: Threat Notifications, Phishing Info, Detected Frauds, Intrusions, DoS Attacks

• Others: General Warnings

This information is consumed by the event-processing engines installed at each participating site. These event engines leverage the computing and storage capabilities available in the local data center to discover malicious attack patterns and other impending threats in a timely fashion.

Such sharing of information raises trust issues with respect to the information flowing in the SR. There can exist different types of SRs with different levels of trust requirements. At one extreme there could be SRs formed by financial institutions that trust each other implicitly (e.g., branches of the same bank), and consequently trust the information being processed and shared in those SRs. At the other extreme, there could be SRs whose membership includes participants that are potential competitors in the financial market.
In this case, the issue of trusting the information circulating in a Semantic Room becomes a point of great importance: if it is not adequately addressed, the Semantic Room abstraction will be infeasible, as financial institutions will refrain from becoming members of it. For instance, processed data in the SR related to DDoS attacks on FIs can be used by a more specialized SR such as DDoS attacks on banks in a country, whose data can be, in turn, used by the SR related to DDoS attacks on a
specific bank in a specific country, in order to provide partners with richer services [42].

A possible usage scenario for the shared event information could be to generate fast and accurate intruder blacklists. The philosophy of sharing such critical information as in CoMiFin [31] gave us many suggestions during the design phase of our proposal.

2.3.2 WOMBAT

The WOMBAT project is a collaborative European funded research project that aims at providing new means to understand the existing and emerging threats that are targeting the Internet economy and the net citizens. The approach carried out by the partners includes a data collection effort as well as some sophisticated analysis techniques.

The Leurré.com project was initially launched in 2003 and has since then been integrated, further improved (SGNET) and developed within the WOMBAT project. It is based on a worldwide distributed system of honeypots running in more than 30 different countries covering the five continents. The main objective of this infrastructure is to get a more realistic picture of certain classes of threats happening on the Internet by collecting unbiased quantitative data in a long-term perspective.

WOMBAT records all packets sent to its sensor machines, on all platforms, and it stores the whole traffic into a database, enriched with some contextual information and with meta-data describing the observed attack sessions. A simple example of the interaction between some components of WOMBAT is shown in figure 2.10.
Figure 2.10: SGNET framework

A source is defined in SGNET/WOMBAT as the contiguous activity of an IP address. Due to the bias introduced by dynamic addressing, an IP address cannot generally be reliably considered a good way to identify an attacker: there are a number of situations in which IP A.A.A.A at day X is likely to be associated with a completely different machine at day Y.

For this reason, they define a source as the activity of a given IP address as long as it is not separated by a long silence time. If IP A.A.A.A is observed attacking one of their honeypot sensors, is then silent for more than 25 hours, and is then witnessed again, it is considered a different source, since there is no evidence that it is still the same machine.

The meta-data collected in the database help to define what they call attack events [38], a representation of specific activities over limited periods of time. The attack events make it possible to observe the evolution of what they hypothesize to be armies of zombies, some of them remaining visible for more than 700 days.

The attack events highlight the existence of coordinated attacks launched by a group of compromised machines, i.e. a zombie army. They compute an action set as a set of attack events that are likely due to the same army. Through the action sets they are able to derive the size and the lifetime of the zombie armies.

The researchers involved in the WOMBAT project present a new attack attribution method [37]. This analytical method aims at identifying large-scale attack phenomena composed of IP sources that are linked to the same
root cause. All malicious sources involved in the same phenomenon constitute what they call a Misbehaving Cloud (MC).

When a huge amount of data is collected, a big issue remains: how to get useful information out of the database. To solve this problem, a set of APIs for querying the WOMBAT database was developed. WAPI is a set of APIs, developed by the project partners of WOMBAT, that allows integrated access to different attack datasets. Every single dataset has its own maintainer, who handles the certificates for accessing the database.

We think that the information coming from such a huge database could be a powerful instrument for a lot of different investigation scenarios.

2.3.3 DShield

DShield is a community-based collaborative firewall log correlation system. It receives logs from volunteers worldwide and uses them to analyze attack trends. It is used as the data collection engine behind the SANS Internet Storm Center (ISC). It is one of the most dominant public attack correlation engines, with worldwide coverage. DShield is regularly used by the media to cover current events, and analysis provided by DShield has been used in the early detection of several worms. DShield data is regularly used by researchers to analyze attack patterns. The goal of the DShield project is to allow access to its correlated information to the public at no charge, to raise awareness and provide accurate and current snapshots of Internet attacks. Several data feeds are provided to users, to either include in their own web sites or to use as an aid to analyze events.
Such sites not only compile global worst offender lists (GWOLs) of the most prolific attack sources, they also regularly post firewall-parsable filters of these lists to help the Internet community fight back. DShield represents a centralized approach to blacklist formulation, with more than 1000 contributors providing a daily perspective on the malicious background radiation that plagues the Internet. The published GWOL captures a snapshot of those class C subnets whose addresses have been logged by the greatest number of contributors. Another common practice is for a local network to create its own local worst offender list (LWOL) of those sites that have attacked it the most. LWOLs have the property of capturing repeat offenders that are indeed more likely to return to the local site in the future, but they are by definition completely reactive to new encounters with previously unseen attackers. The GWOL strategy, on the other hand, has the potential to inform a local network of highly prolific attackers, but it also has the potential to provide a subscriber with a list of addresses that will simply never be encountered.
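As a rough illustration of how a GWOL is compiled, the sketch below ranks /24 subnets by how many distinct contributors logged an address inside them. The report format, contributor names and IP addresses are hypothetical; real DShield feeds carry far more detail:

```python
from collections import defaultdict

def compile_gwol(reports, top_n=2):
    """Rank class C (/24) subnets by the number of distinct contributors
    that logged an address inside them, GWOL-style."""
    contributors_per_subnet = defaultdict(set)
    for contributor, ip in reports:
        subnet = ".".join(ip.split(".")[:3]) + ".0/24"
        contributors_per_subnet[subnet].add(contributor)
    ranked = sorted(contributors_per_subnet.items(),
                    key=lambda kv: len(kv[1]), reverse=True)
    return [subnet for subnet, _ in ranked[:top_n]]

reports = [
    ("siteA", "203.0.113.7"), ("siteB", "203.0.113.99"),
    ("siteC", "203.0.113.42"), ("siteA", "198.51.100.1"),
    ("siteB", "198.51.100.200"), ("siteA", "192.0.2.5"),
]
print(compile_gwol(reports))  # ['203.0.113.0/24', '198.51.100.0/24']
```

An LWOL would be the analogous list computed over a single network's own logs only.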
Both of these approaches are possible solutions to the same problem encountered in WOMBAT: how to extract meaningful data from such a huge database. To this end a technique called HPB [40] (Highly Predictive Blacklisting) was developed.

Highly Predictive Blacklisting

Highly Predictive Blacklisting is a different approach to blacklist formulation in the context of large-scale log-sharing repositories, such as DShield. The objective of HPB is to construct a customized blacklist per repository contributor that reflects the most probable set of addresses that may attack the contributor over a prediction window that may last several days. HPB enumerates all sources of reported attacks and assigns each of them a ranking score relative to its probability of attacking the contributor in the future. The ranking score is based on observations of the particular attacker's past activities, as well as on the collective attack patterns exhibited by all other attackers in the alert repository. This is a key difference between the HPB algorithm and the other blacklist strategies: in the compilation of a GWOL, an LWOL or their like, each blacklist entry is selected solely on the basis of its own attack history. The HPB strategy, in contrast, takes a collective approach. HPB attacker selection is influenced both by an attacker's own attack patterns and by the full set of attack patterns found within the dataset. A correlation among the contributors, induced by the attackers they have in common, is introduced here, i.e., the amount of attacker overlap between each pair of contributors. The ranking score in HPB considers not only how many contributors have reported an attacker but also who gave the reports: it favors attackers reported by many contributors that are also correlated (have many common attackers) with the contributor under consideration.
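A minimal sketch of this collective idea is given below: an attacker's score for a given contributor sums the attacker-overlap correlation between that contributor and every peer that reported the attacker. This simplification is our own; the actual HPB system uses a more elaborate relevance-propagation scheme over the contributor correlation graph:

```python
def hpb_scores(reports, victim):
    """Score each attacker for `victim` by summing the attacker-overlap
    correlation between the victim and every contributor that reported it."""
    seen = {}  # contributor -> set of attackers it reported
    for contributor, attacker in reports:
        seen.setdefault(contributor, set()).add(attacker)

    def corr(a, b):
        return len(seen.get(a, set()) & seen.get(b, set()))

    scores = {}
    for contributor, attackers in seen.items():
        if contributor == victim:
            continue
        weight = corr(victim, contributor)
        for atk in attackers:
            scores[atk] = scores.get(atk, 0) + weight
    return scores

reports = [("v", "a1"), ("v", "a2"),
           ("c1", "a1"), ("c1", "a2"), ("c1", "a3"),
           ("c2", "a4")]
scores = hpb_scores(reports, "v")
print(scores["a3"], scores["a4"])  # 2 0
```

Attacker a3 scores 2 for victim v because it was reported by c1, which shares two attackers with v, while a4 scores 0, even though v has directly seen neither: exactly the predictive behavior that pure GWOLs and LWOLs lack.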
The contributor correlation used in HPB is inspired by the work of Katti et al. [29].

2.3.4 FIRE: FInding RoguE Networks

FIRE [41] is a novel system to identify and expose organizations and ISPs that demonstrate persistent malicious behavior. FIRE can help isolate networks that tolerate and aid miscreants in conducting malicious activity on the Internet. To make this possible, FIRE actively monitors botnet communication channels, drive-by-download servers, and phishing web sites. This data is refined and correlated to quantify the degree of malicious activity of individual organizations. With respect to root cause analysis, these results can be used to pinpoint and track the activity of rogue organizations, preventing criminals from establishing strongholds on the Internet. Also, the
information can be compiled into a null-routing blacklist to immediately halt traffic from malicious networks.
Chapter 3

Models Overview

3.1 Attacker and Victim Model

In this section we first describe the victim model, in other words the target system that we assume. Second, we describe the attacker model: the strategies used by the attacker to make the attack successful. Finally, we describe our proposed defense model.

3.1.1 Victim Model

We consider a general victim model as a pool of servers with multiple resources. In our experiments, we focus on a Portal Content Management System (CMS) application hosted on a web cluster, which consists of multiple tiers for processing requests, as shown in Fig. 3.1. In the figure it is possible to see all the tiers of the architecture. At the top there is a Border Router, the component commonly used within a company's infrastructure between the Internet and the internal network. Below the Border Router, a Load Balancer has the duty of balancing the requests that need to be forwarded to the replicated web servers. By WS tier we mean the layer of web server software, i.e. Apache, IIS, Nginx. We define a portal session as an HTTP/1.1 session over a TCP socket connection that is initiated by a client with the web server tier. HTTP/1.1 sessions are persistent connections and allow a client to send requests and retrieve responses from the web cluster without incurring the overhead of opening a new TCP connection per request. A legitimate HTTP/1.1 session consists of multiple requests sent during the lifetime of the session. These requests are either sent in a closed-loop fashion, i.e., the client sends a request and then waits for the response before sending the next request. Otherwise
they are pipelined, i.e., the client can send several requests without waiting for their responses and thus have more than one pending request on the server. A page is typically retrieved by sending one main request for the textual content and several embedded requests for the image files embedded within the main page. The main requests are typically dynamic and involve processing at the database tier, while the embedded requests are usually static, as they involve processing only at the web-cluster tier.

Figure 3.1: Target Architecture

Every tier in the system consists of multiple limited resources, such as computation, storage and network bandwidth. We assume that all tiers continuously monitor the resources in the tier and periodically generate resource utilization reports, as well as overall system statistics at the application layer, such as throughput and response time. The system is said to be under a resource attack when a surge in a resource's usage is accompanied by a reduction in throughput and an increase in response time, without a DDoS attack being detected at lower layers.
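The resource-attack condition above (a utilization surge together with a throughput drop and a response-time rise) could be checked per monitoring report roughly as follows; the thresholds and the report layout are illustrative assumptions, not taken from the thesis:

```python
def resource_attack_suspected(report, baseline,
                              usage_surge=1.5, tput_drop=0.7, rt_rise=1.5):
    """Flag a resource attack: some resource's utilization surges while
    throughput falls and response time rises relative to a baseline."""
    surged = any(report["usage"][r] > usage_surge * baseline["usage"][r]
                 for r in report["usage"])
    return (surged
            and report["throughput"] < tput_drop * baseline["throughput"]
            and report["response_time"] > rt_rise * baseline["response_time"])

baseline = {"usage": {"cpu": 0.30, "mem": 0.40},
            "throughput": 500.0, "response_time": 0.12}
spike = {"usage": {"cpu": 0.90, "mem": 0.45},
         "throughput": 200.0, "response_time": 0.80}
print(resource_attack_suspected(spike, baseline))  # True
```

Note that all three conditions must hold: a CPU surge with unchanged throughput and latency would not, by itself, be flagged.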
3.1.2 Attacker Model

The most powerful attack scenarios are an extension of those in DDoS-Shield [22] and are described below. The goal of the attacker is to overwhelm one or more server resources so that legitimate clients experience high delays or lower throughput, thereby reducing or eliminating the capacity of the servers for their intended clients. The attacker uses the application interface to issue requests that mimic legitimate client requests, but whose only goal is to consume server resources. We assume that the application interface presented by the servers is known (e.g., HTTP, XML, SOAP) or can be discovered (e.g., UDDI [58] or WSDL [57]). We consider session-oriented connections to the server, e.g., an HTTP/1.1 session on a TCP connection with the server. We assume that the attacker has commandeered a large number of machines distributed across different geographical areas, organized into networks known as botnets. To start a TCP session, an attacker can either use the actual IP address of the compromised machine or spoof that address with one chosen from the botnet's addresses. Based on the workload parameters that the attacker can leverage to execute layer-7 attacks, we divide these attacks into the following three classes, according to DDoS-Shield [22]:

1) Request Flooding Attack: each attack session issues requests at an increased rate compared to a non-attacking session.

2) Asymmetric Workload Attack: every attack session sends a higher proportion of requests that are more taxing on the server in terms of one or more specific resources. The request rate within a session is not necessarily higher than normal. This attack differs from the request-flooding attack in that it causes more damage per request by selectively sending heavier requests.
Moreover, this attack can be invoked at a lower request rate, thereby requiring less work from the attacker and making detection increasingly difficult.

3) Repeated One-Shot Attack: a degenerate case of the asymmetric workload attack, in which the attacker sends only one heavy request per session instead of multiple heavy requests per session. Thus, the attacker spreads its workload across multiple sessions instead of across multiple requests in a few sessions. The main benefit of this spreading is that the attacker is able to evade detection, and potential service degradation of the session, by closing the session immediately after sending the request. The asymmetric workload attack and its variants exploit the heterogeneity in processing times for different request
types. The attacker can obtain information about the server resources consumed by different legitimate request types through monitoring and profiling.

It is not really relevant whether the user is logged into the system or not, nor whether he is connected through an HTTPS authenticated session. A possible scenario here is one in which only a subset of the botnet's nodes is connected before the real attack, and only when the attacker wants to launch the DDoS are all the nodes used (maybe for a very short period of time). This situation can confuse a large number of defense techniques that are based on detecting peaks of never-before-seen clients, and in general all the solutions that, during an attack, tag as legitimate the clients that were connected while the system was not under attack. We assume that the worst-case scenario is the one in which the attacker knows the full profiling data, and can therefore select the requests that maximize the amount of server resources consumed. In general, however, this type of information can only be obtained by profiling and timing the server responses from outside. For instance, to obtain the average server processing time per requested page, the attacker uses a web crawler to obtain the total (network + server) delay in processing the request. We assume that each bot is clever enough to solve every kind of puzzle test, either directly or through a forwarding system [48]. In the attack model presented above the requests can be generated in different manners, which we generalize as:

1. Frantic Crawler: automatic software, running on a single node or on multiple distributed nodes, that follows every link it finds, or a subset of them.

2. Cloned Legitimate Recorded Session: a sequence of requests taken from a recorded legitimate session and forwarded to each member of the botnet.

3. Randomized Legitimate Recorded Session: like a cloned legitimate recorded session, but smarter.
It includes random noise inside the session, such as random mouse movements from the origin to the target position, random link following, and random field filling.

3.1.3 Defense Model

We introduce a new collaborative counter-DDoS mechanism to protect the application from layer-7 DDoS attacks and to provide adequate service to legitimate clients even during the attack.
Our defense model consists of different components that together improve the reliability of the web application during the attack. The core components are the Smart Proxy, the Advanced Deep Logger, and the Reputation Database. The Smart Proxy prioritizes incoming requests, before they reach the web-cluster tier, on the basis of the trust level of the source that generates them. The Advanced Deep Logger is a set of tools that enriches the information typically stored in the web server logs, making the auditing process easier during and after the attack. The Reputation Database is owned and updated by a group of trusted partners that want to share useful information about the malicious clients that have tried to attack their own systems. In this database all the information about the trust level of malicious clients is collected and continuously updated with information coming from the most recent attacks. The trust level might also be affected by attack information coming from trusted third-party entities. Even though the situation described above is just a simple attack attribution technique, we had to make some assumptions in our model so that the real source of a request, and not just a spoofed one, is logged. There are existing techniques that can detect whether an IP is spoofed or not, and we assume that they are deployed on the network infrastructure of the ISP, or on our own network in front of the web server (as described in [24, 25]). Furthermore, we extend the detection process used in DDoS-Shield [22] by periodically checking how many IP sources that are part of our database are connected, and by finding a threshold that can suggest to our system to increase the level of filtering.
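The periodic check just described might map the fraction of currently connected clients that appear in the shared database to a filtering level; the level names, threshold values and data layout below are illustrative assumptions, not part of the model:

```python
def filtering_level(connected_ips, suspected_db, thresholds=(0.05, 0.20)):
    """Map the fraction of connected clients found in the shared
    reputation database to a filtering level."""
    if not connected_ips:
        return "normal"
    ratio = sum(ip in suspected_db for ip in connected_ips) / len(connected_ips)
    if ratio >= thresholds[1]:
        return "strict"    # e.g. push filtering toward the border router
    if ratio >= thresholds[0]:
        return "elevated"  # e.g. tighten per-client shares in the scheduler
    return "normal"

suspected = {"10.0.0.1", "10.0.0.2"}
print(filtering_level(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
                      suspected))  # strict: half the connected clients are suspected
```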
For example, we can achieve this by being more restrictive with the resources given to clients in the scheduler policy, or by asking to filter the malicious IPs as far as possible from the web server.

3.2 Defense Strategies

The studied solution is mainly based on three phases: the detection of the DDoS on the protected system, the classification of the incoming requests into legitimate and (possibly) malicious traffic, and the response against the detected attack.

3.2.1 Detection

Detecting a DDoS attack is not easy, and there are many different approaches. Many of them are based on anomaly detection over the traffic distribution or volume [45, 33], or on the detection of
  • 47. 38 CHAPTER 3. MODELS OVERVIEWalready known attacks signatures[46, 47]. Proling a legitimate behavior isnot easy, as well we cannot know every attacks signature since there will bealways a new one, never seen/known before. Both of these approaches arenot helpful in our attacker model, considering that we assume the attackersbehavior as a well-chosen human generated sequence of actions. Since thiscan be a real scenario (submitted from a legitimate user) it must not bedetected as a bad behavior, maybe only because it is far from the comparedlegitimate proles. Otherwise it means that the sets of good proles are toosmall to allow possible application usage. In other words, it will reect in anhigher number of detected false negatives. A statistical system, which bases its decision on a starting training set,usually has a trade-o on how exible it should be. We can split such kindof statistical system in two main families: o-line and on-line. In the rst (o-line) case the training set is commonly built in a customand safe environment (usually used for testing and production). This setupmakes impossible to put bad data on it, but that means also that some goodbehavior (for anomaly-based systems), or attackers signature (in the case ofsignature-based systems), is not included in the initial training set because itis impossible to cover all these kinds of behavior. This issue will be inevitablyreected in detection of false positives or negatives. On the other hand, the on-line family is prone to poison attacks, in thesense that the attacker has a chance to inject customized data to the systemuntil they become a part of the legitimate training set. Thats also theproblem of DDoS Shield, that we want to improve and extend. 
Our approach to detection is based more on the user's perception than on the variation in QoS as in [16], but we extend this process by also monitoring the workload of the individual critical components of our architecture (CPU, disk, memory load) and by means of a new metric that we propose, based on the number of suspected clients concurrently connected to our system in a certain time frame.

3.2.2 Classification

The problem of distinguishing legitimate from malicious traffic is commonly reduced to the problem of distinguishing a DDoS from a flash crowd. Over the last years, the bots' interaction with the application has come very close to human behavior, which makes it very difficult to distinguish bots from humans. As we have already discussed the shortcomings of statistical approaches in the detection of a DDoS, the same kind of system may suffer the same issues in the classification process. In fact, distinguishing legitimate users from malicious ones
  • 48. 3.2. DEFENSE STRATEGIES 39trough a statistical system at Layer-7 without detecting false positives ornegatives is a big challenge. One of the main alternatives to distinguish humans from bots is to giveto the clients a reverse Turing test to solve, i.e. captcha or more in general apuzzle-based test. Unfortunately, there are a lot of studies about techniquesthat makes these tests solvable by a bot. When the puzzle is too hard to besolved in an automatic way, the attacker can also hire people for solving that(slow approach but eective anyway in spamming scenarios), or it can simplyforward this information to popular website that oers resources illegallylike movies, music and copyrighted software. Since the only cost that isrequired for accessing this (expensive) resources is to solve an easy puzzle,every user that is interested in such kind of resources becomes unwittingaccomplice of the DDoS attack. Then it doesnt matter how hard are thesepuzzles to be solved. Puzzle-based solutions are commonly used in capability systems, in whichafter some active (i.e. puzzle) or passive analysis the client gets a booleanaccess to the protected system. Unfortunately, this approach can be verydangerous because the bots are becoming really smart and if they are ableto solve the puzzle, as we have described above, in proposed solutions theyget a capability to access the system. They can access the system until thecapability expires. We think that the best approach is to use a ne-grain level of suspicionthat has to be assigned to suspected clients. That allows our system to reducethe problems related to the detection of false positive clients like in [22]. 
But instead of classifying clients through a comparison with a set of good profiles, we rely on the information in a collaborative, shared database, in which the partners of the collaborative infrastructure can share information about the sources of detected attacks and from which they can obtain information about the suspicion level of clients. To improve the database and to reduce the number of false negatives, we use some known automatic techniques, such as crawler traps, for detecting unfair crawlers or certain randomized bot actions. In order to improve the detection of other, unknown attacks, we augment the data collected from the users in the logs with further information, such as the users' actions, keystrokes and mouse movements. Thanks to this information we aid an auditor in performing a forensic analysis to detect malicious actions and the sources from which they were generated.
3.2.3 Response

After the incoming traffic has been classified, the common approaches to handling the malicious part are to redirect it or to drop it. Dropping the traffic could be dangerous in the case of detected false positives, while redirecting the traffic for further analysis can be interesting, as in Roaming Honeypots [44]. In our scenario, in which the attacker interacts intensively with the applications, we cannot be absolutely sure about the legitimate or malicious intention of a user from the beginning. In fact, as we have discussed before, giving a test to all the clients, or to the suspected subset of them, does not help mitigate our attacker model, since we assume that the attacker is protocol compliant and is able to solve these kinds of tests. Similarly to DDoS-Shield [22], we think that the best solution is to schedule the incoming requests on the basis of a fine-grained suspicion level, by means of what we call the Smart Proxy, and, in case there are not enough resources available for the suspected clients, to drop their requests. Nevertheless, our suspicion assignment method is completely different, being based on the reputation of the clients. Furthermore, to reduce the traffic and the computation on the Smart Proxy, we also use a Pushback [4] approach on the border router of the company hosting the server farm. In this way we are able to shape the traffic coming from already-suspected clients as close as possible to its source, so that the Smart Proxy benefits from it. At the end of the attack it will in any case be possible to perform a forensic analysis, since the actions collected from the clients can be correlated with the data in the collaborative database, speeding up the auditing process and helping to find all the sources of the attack; submitting this data to the database benefits the target, as well as all the members of the collaborative database, in being ready for further attacks from those sources.
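The scheduling policy described above can be reduced to a toy priority queue: requests are served least-suspicious first, and whatever exceeds the available capacity is dropped. The names and the suspicion scale are illustrative; the real Smart Proxy would also have to account for per-resource request costs and Pushback signals:

```python
import heapq

def schedule(requests, capacity):
    """Serve at most `capacity` requests, least suspicious first, and drop
    the remainder. `requests` is a list of (client, suspicion) pairs,
    with suspicion in [0, 1]."""
    queue = [(susp, i, client) for i, (client, susp) in enumerate(requests)]
    heapq.heapify(queue)
    served = [heapq.heappop(queue)[2]
              for _ in range(min(capacity, len(queue)))]
    dropped = [client for _, _, client in queue]
    return served, dropped

served, dropped = schedule([("bot", 0.9), ("alice", 0.1), ("bob", 0.2)], 2)
print(served, dropped)  # ['alice', 'bob'] ['bot']
```

The arrival index `i` breaks ties between equally suspicious clients in arrival order, so two clients with the same score are never reordered arbitrarily.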
Chapter 4

Software Architecture

In this chapter we first present the logical description of our proposed defense solution; then, in the second section, we describe the features of every component.
4.1 Logical Description

Figure 4.1 shows the model overview of our solution. In the following subsections we describe each aspect of our proposal, categorizing it as detection, classification and response.

Figure 4.1: Model Overview
4.1.1 Detection

We agree with Mirkovic et al. [16] on the importance of measuring the perception of service quality from the client's perspective, as human users are usually the first victims affected by a denial of service on a server, since they perceive the degradation of service caused by the DoS. Measuring this perception is not the main focus of this study, and we think that the existing QoS-based metrics proposed by Mirkovic et al. can be very useful in our scenario as well. We summarize them here:

A transaction represents a higher-level task whose completion is perceptible and meaningful to a user. A transaction usually involves a single request-reply exchange between a client and a server, or several such exchanges that occur close in time.

A transaction is successful if it meets all the QoS requirements of its corresponding application. If at least one QoS requirement is not met, the transaction is considered failed. Transaction success/failure is the core of the metrics proposed by [16] for measuring the user's perception. The transaction success/failure measures are aggregated into several intuitive composite metrics:

Percentage of failed transactions (pft) per application type. This metric directly captures the impact of a DoS attack on network services by quantifying the QoS experienced by users. For each transaction that overlaps with the attack, we evaluate transaction success or failure.

DoS-hist shows the histogram of pft measures across applications, and is helpful for understanding each application's resilience to the attack. The DoS-level metric is the weighted average of the pft measures for all applications of interest.
This metric may be useful to produce a single number that describes the DoS impact, but it is highly dependent on the chosen application weights and can thus be biased.

QoS-ratio is the difference between a transaction's traffic measurement and its corresponding threshold, divided by that threshold. The QoS metric for each successful transaction shows the user-perceived service quality, in the range (0, 1], where higher numbers indicate better quality. It is useful to evaluate service quality degradation during attacks.

Like Mirkovic, we compute it by averaging the QoS-ratios over all traffic measurements of a given transaction that have defined thresholds. For failed transactions, we compute the related QoS-degrade metric, to quantify the severity of
service denial. QoS-degrade is the absolute value of the QoS-ratio of the transaction measurement that exceeded its QoS threshold by the largest margin. This metric is in the range [0, +∞). Intuitively, a value N of QoS-degrade means that the service of the failed transactions was N times worse than a user could tolerate. While arguably any denial is significant and there is no need to quantify its severity, the perception of DoS is highly subjective: low values of QoS-degrade (e.g., 1) may signify service quality that is acceptable to some users. These metrics can also be very useful for comparing our proposal with other alternatives, but we need further metrics that are meaningful at a deeper level of our architecture and help us detect attack attempts against our infrastructure. Seeing that the target of our attacker model is the web application hosted in our infrastructure, we should take care of its different critical components. Load balancer, server, application and database are the basic components that form the tiers commonly necessary for hosting web applications. On top of the load balancer there might be at least one other component directly managed by the company, such as a border router. Everything that lies before or outside the border router is not part of our research, since it would require a level of collaboration, inter- and intra-ISP, AS and so on, that makes this kind of detection approach very hard to implement in the real world: it is simply not possible to trust every single entity involved in the communication path. In particular, these are the main points that we have chosen to monitor:

1. CPU utilization of the web, database and smartproxy tiers;

2. Memory allocation in the web and application tiers;

3.
Average throughput in requests/second achieved per normal client session.

The CPU workload at the web server and database tiers is quite critical, given that our attacker model submits well-chosen actions aimed at knocking down the weakest link of the system. In the Suspicion Checking component we also monitor the number of clients, suspected according to the reputation database, that are concurrently connected to the system. This number is an indicator that helps us to know, in real time, how many suspected clients are concurrently connected in a certain time frame. We suppose that the higher this metric, the higher the probability that the system is under attack, or close to being attacked. We also suppose that there is no single threshold that fits all the metrics.
That is the main reason why we allow, and suggest, that it be customized (by an administrator) according to the average number of clients typically using the web application. Agents regularly monitor the components to which they are attached, and report the results of their analysis to the Monitoring Frame. The Monitoring Frame has the task of continuously analyzing the reports generated by the agents and, in case the (customizable) alert thresholds are exceeded, of taking the necessary countermeasures.

4.1.2 Classification

We have already highlighted the weakness of statistical techniques in classifying legitimate and malicious traffic at the application layer, as well as the problem that puzzle solutions like CAPTCHAs have in telling humans from bots, so we now describe what is, from our point of view, the best solution. Instead of profiling the behavior of a single client only, during an initialized session, we think it is better to handle new connection requests in a trust-based way. The trust information about clients is collected in a database maintained by a collaborative network of trusted partners. In particular, it contains information on the malicious rating of clients that were detected as part of a DDoS attack against one of the other partners; we describe this process below.
In our solution we propose some techniques for improving the database by means of Automatically Detectable Malicious Behaviors, off-line auditing analysis and users' activity tracking.

Suspicion Assignment Mechanism

Our suspicion assignment mechanism is mainly based on the long-term reputation introduced in [36], which is described as:

Long-term reputation: a reputation that takes into account all info-items about a client, regardless of the item age.

Building this kind of reputation is possible thanks to previous experience with each client on the system, but it can be greatly extended through the exchange of information with other partners. This information is stored in a collaborative database to which datasets can be committed directly by a member of the collaborative architecture, as well as by a third-party project related to botnet tracking or malware detection, such as WOMBAT [37].
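A minimal way to aggregate such info-items into a single long-term score is a weighted mean that deliberately ignores item age; the (weight, suspicion) item layout and the weighting scheme are our own illustrative choices, not prescribed by [36]:

```python
def long_term_reputation(info_items):
    """Combine all info-items about a client into one suspicion score in
    [0, 1], regardless of item age. Each item is a (reporter_weight,
    suspicion) pair: local observations might carry more weight than
    partner reports."""
    if not info_items:
        return 0.0
    total_weight = sum(w for w, _ in info_items)
    return sum(w * s for w, s in info_items) / total_weight

# One local observation plus two partner reports, all retained forever:
items = [(1.0, 0.9), (0.5, 0.6), (0.5, 0.8)]
print(long_term_reputation(items))  # ~0.8
```

Because no item is ever discarded, a client that misbehaved long ago keeps a non-zero score; a real deployment would likely combine this with the dataset danger ranks discussed next.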
Every single dataset has a rating level according to its dangerousness (based on the number of detected hits), and every single IP inside it also has its own danger rank. In this way it is possible to tackle the problem of receiving an unreliable dataset. First of all, if a compromised PC has been sanitized, its IP should not appear in new attacks, and meanwhile the danger rate of this IP can be decreased as well. Secondly, if an army dataset has simply been submitted by a malicious employee, or by any human mistake, it might be more complicated for the other partners to detect and verify the same single IP or subset of IPs. In that case the collaborative infrastructure should not increase the trustworthiness rank of the submitted dataset: its danger rate remains in a not-verified state. The basic infrastructure for sharing these datasets can be an extension of the overlay network used in [30], introducing a reputation-based system to handle the danger rate of each dataset (usually related to a botnet) and also of every individual IP source.

Automatically Detectable Malicious Behavior

First, we decided to detect certain malicious activities in order to anticipate the moment when the real DDoS attack becomes heavier. These activities can be divided into two main categories:

1. Requests for forbidden resources;

2. Account credential reuse.

The first attack strategy, described in par. 3.1.1, is easily detectable simply by introducing a large number of hidden trap links (also called crawler traps) into the exclusion list of robots.txt [59]. In this way we can distinguish a fair crawler from a malicious one. As soon as a link of the hidden subset is requested, the Automatic Suspicion component can automatically submit the source of the attack to the central database, tagging it with the appropriate attacker model detected.
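A crawler-trap check of this kind can be sketched in a few lines; the trap paths, the tag name and the database layout are hypothetical:

```python
# Hypothetical trap paths: listed under Disallow: in robots.txt and never
# linked visibly, so fair crawlers skip them and humans never see them.
TRAP_LINKS = {"/private/do-not-follow", "/archive/hidden-9f3a"}

def check_request(source_ip, path, reputation_db):
    """On a trap-link hit, record the source in the shared database,
    tagged with the detected attacker model (the Frantic Crawler)."""
    if path in TRAP_LINKS:
        reputation_db.setdefault(source_ip, []).append("frantic-crawler")
        return True
    return False

db = {}
check_request("198.51.100.7", "/archive/hidden-9f3a", db)
print(db)  # {'198.51.100.7': ['frantic-crawler']}
```

A request for a normal page leaves the database untouched, so legitimate users and well-behaved crawlers never trip the check.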
There are other automatically detectable attacks, such as the forgery and reuse of session tokens associated with other clients, and multiple connections to the same user account from different locations (the threshold for the maximum number allowed should be customizable). We assume that best practices for the secure development and implementation of applications are applied, in particular regarding checks against session token theft, reuse or tampering, which are very common in attacks such as
Cross-Site Request Forgery; we describe its importance in the Classification section. In environments in which this assumption cannot be made, it is necessary to add a plug-in to the Automatic Check components to strengthen the detection logic. What we are not assuming, on the other hand, is that one user cannot be simultaneously connected from different locations; this is actually commonly allowed in a large number of web applications, since it is sometimes necessary for specific usages. Since we want to provide flexibility here, we simply monitor this event and compare it against a customizable threshold for the maximum number of different locations per user. We think it is important to keep track of this behavior, especially when the application allows this particular usage without any kind of restriction or control: in that case it is easy to use a small number of stolen or automatically created credentials and share them among a wide number of nodes of the botnet. If, instead, we put some restrictions on this, the power of the botnet is reduced, simply because not every node of the botnet will be allowed to access our resources.

Off-line Auditing Scenarios

We have already described how the Request Flooding Attack can be detected automatically. The other two attack strategies are more complicated to detect, and a post-mortem auditing analysis might be necessary to find their sources. The requests generated from a cloned legitimate session, when JavaScript is enabled, permit the collection of the users' actions, such as mouse movements and key presses, through a JavaScript logger. Thanks to this tool we are able to reconstruct the user's pattern of actions. Analyzing the most frequent pattern of actions during the attack first, and comparing its sources with the malicious IP sources stored in the database, can make the detection of the attack vector easier.
Sometimes a real legitimate recorded session can be easily detected simply by analyzing the mouse movement maps from different sources. If these maps match, we have a clear sign of an anomaly; we can then trace all the sources that have generated this pattern of actions and submit them to the database.

In each scenario we suppose that the attackers can be either always or occasionally connected to the target system, periodically submitting a sequence of light, common actions with the aim of confusing the intrusion detection system. Only when the attacker (botmaster) believes that it is time to attack does he change the committed actions to the ones that he knows
are the heaviest for the server to handle (while still legitimate). There is no way to detect this particular attack behavior by analyzing only the request frequency spectrum, as in [34, 35, 45, 32].

When the requests come from a randomized legitimate recorded session there is no reliable way to detect it automatically; this makes it one of the hardest attacks to detect. There are basically two types of random actions: completely random (detectable through the crawler traps injected in the webpages), or controlled to some extent. The latter case, for example, can be: move the mouse from A to B in a random way, then click. It is very hard to recognize, because it introduces noise that does not permit, for example, the application of techniques like mouse-tracking map matching or similar. In this scenario a sort of common pattern is still present; although it will not be easy to detect due to the presence of noise, the information collected provides very useful support for a deep auditing analysis.

Users' activity tracking. In order to improve the submission of new datasets to the shared database, a system that can detect the attack pattern used by the attacker is necessary. As we want to tackle the problem of sequences of requests that can be close (if not equal) to legitimate human behavior, we need to audit the traffic generated during the attack. To do that we need to enrich the information collected from the webserver log, and to aggregate this data in such a way that it helps detect similar patterns of usage.

Increasing the information collected from the users is a big trade-off because of the generated overhead, in particular for the storage; on the other hand, if we do not collect enough information from the clients, we are not able to detect the attack vector and then to find the sources of the attack.

All the information from clients, i.e.
actions taken by users interacting with the WebApp, is collected by the ADL (Actions Deep Logger) in a separate DB. The main logged actions are the keystrokes pressed during the filling of a web form, cursor movement tracking, and all the information concerning the user agent and the client's environment, i.e. the information typically collected by common WebAnalytics applications. It also keeps track of requests made to the links/resources injected into the pages as crawler traps.
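A minimal sketch of how a trap hit could be tagged, assuming hypothetical trap paths and the JavaScript-enabled heuristic discussed in the Automatic Checking description:

```python
# Hypothetical trap URLs injected into pages as hidden links; a real
# deployment would generate and rotate these server-side.
TRAP_PATHS = {"/static/.hidden/promo", "/ajax/__trap__"}

def classify_trap_hit(path, javascript_enabled):
    """Classify a request that touched a crawler trap.

    A plain crawler usually has no JavaScript, so a trap hit without JS is
    tagged medium/low; a trap hit from a JS-enabled client suggests a
    randomized recorded session and is tagged high."""
    if path not in TRAP_PATHS:
        return None  # not a trap hit, nothing to report
    return "medium/low" if not javascript_enabled else "high"
```

For instance, `classify_trap_hit("/ajax/__trap__", False)` would tag a likely crawler, while the same path with JavaScript enabled would be tagged as a high-suspicion source.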
4.1.3 Response

The core idea behind the response phase is to schedule the incoming requests according to their suspicion priority. This can be done continuously, or only when a DDoS attack is detected. The choice depends on the generated overhead and can be made by the administrator.

Request scheduler. On top of every single Web Server front end, a reverse proxy schedules the received requests with a priority-based policy. The scheduling policy is very similar to the one introduced in DDoS-Shield. The main difference is that we focus especially on the measure of suspicion assigned by our collaborative suspicion mechanism to define whether a particular request is allowed to be forwarded to the Application Server or not.

One of the design principles of our scheduler is that even malicious requests can be forwarded as long as there is capacity in the system and there are no lower-suspicion requests waiting to be scheduled. This principle mitigates the effect of false negatives in suspicion assignment, since legitimate sessions, which may have been inaccurately assigned a high suspicion, still have a chance of being serviced.

When the session is established, the system exchanges a session ID with the client (obviously generated at the server side for security reasons), and during this phase we assign the proper suspicion level to that session according to the previously collected information (stored in the database) regarding the reputation of that source. Each session has a limited and fixed time to live before it expires and needs to be renewed. The accuracy of the identification of the sources can be extended by using a tuple based on the source IP and the user's fingerprints, as in [43].

Border Router filtering. If the number of simultaneously detected suspicious clients exceeds the threshold, every single request is checked (if this is not already done) and the detected suspicious IPs are notified to the backward router in the internal network.
In this way, it is possible to put in place a filter that helps limit the traffic generated by these clients (in case they are legitimate). The same process is carried out when the Monitor Frame detects an overload at the SmartProxy layer, as it might be due to an attack against the mitigation system itself.

Fitting all the IPs of the reputation database onto the border router may require too much memory and generate too much computational overhead; comparing each incoming IP request with the ones included in the blacklist might be a very expensive operation from a computational point of view.
That is the reason why we suggest optimizing the filtering operation on the border routers by retrieving, from the reputational database, only the other IPs that are part of the same army in which the notified IP was involved. This reduces the number of IPs included in the temporary blacklist, saving a lot of memory space and the overhead time needed for filtering. Moreover, it makes it possible to be ready to filter the traffic that could come from the same army (i.e. botnet) even before the first connection attempt.

The border router's policy can be:

• to drop the packets generated by these IPs,

• to shape them to a narrow-bandwidth channel.

4.2 Model's Components Description

4.2.1 Border Router

This is a very common and standard component in our target scenario. The role of this component in our solution is secondary, but it could be useful in case the introduced smartproxy becomes the bottleneck and the weakest component under heavy traffic, or under attack. What we need from this component is the ability to filter as much as possible at the border side of the target network. We assume that the border router is under the full control of the administrators of the target website, or that it can easily be configured if necessary.

The filter can be either bandwidth shaping for clients with a low trust level, or a complete drop for clients with a high dangerousness (minimum level of trust). The technique is very similar to the one introduced by pushback [4], which relieves the load on the components of the system that sit on top of the WebApps, with an obvious increase in performance. If the monitoring system continues to detect overloads, it reports the state to the system administrators, advising them to take immediate action to analyze and solve the problem.
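The army-based reduction described above could be sketched as follows, where `reputation_db` is a hypothetical in-memory view of the collaborative database mapping each IP to the botnets it was seen in:

```python
def botnet_filter_set(trigger_ip, reputation_db):
    """Given one notified IP, return the reduced blacklist: only the members
    of the armies (botnets) that IP was seen in, not the whole database.

    reputation_db is a hypothetical mapping: ip -> set of botnet ids."""
    armies = reputation_db.get(trigger_ip, set())
    return {ip for ip, groups in reputation_db.items() if groups & armies}

db = {
    "198.51.100.1": {"botnet-A"},
    "198.51.100.2": {"botnet-A"},
    "203.0.113.7":  {"botnet-B"},
}
blacklist = botnet_filter_set("198.51.100.1", db)
```

Only the two members of botnet-A end up in the temporary blacklist, rather than every IP stored in the collaborative database.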
To make the filtering phase even more effective at reducing the load on the smartproxy, we do not just filter the incoming requests on the basis of the already connected clients; we also temporarily add to the filtering rules the IPs of all the members of the botnet in which the malicious client was detected in the past, according to the information included in the collaborative database. This enhancement provides two benefits.
The first: in case of a distributed attack like the one-shot attack described in 3.1.2, if we are able to predict from which sources the attack might come, we can filter it before the requests reach the smartproxy, and later the webserver. The problem is that, before the smartproxy starts to filter, it needs to forward to the webserver every new request in order to establish the connection with the web-application logic and then to generate the associated session ID. This first communication with the webserver could be a bottleneck. This is the reason why we need to put a layer of filtering-as-a-service one step before the smartproxy. In order to make this filtering computationally cheap, we only filter at the IP level, and only if the load on the smartproxy becomes heavy. We think it is better to filter on the border router only when necessary, since IP-based filtering could also affect some legitimate clients (false positives).

The second benefit comes from our choice of filtering all the bots involved in the same botnet in which the connected client has been seen before, according to the information stored in the collaborative trust database. This helps us keep the space necessary to store all the IPs that we want to filter as small as possible, making the filtering process faster than if we used all the IPs stored in the collaborative database. By keeping track of the load on the smartproxy during our tests, we can figure out how necessary these components are.

4.2.2 SP: Smart Proxy

It is the kernel of the mechanism for mitigating suspicious traffic. The location of the smartproxy in our proposal is shown in figure 4.2. Its main functions are to forward incoming requests to the Web servers that handle the WebApplications, scheduling the requests with a priority based on the suspicion level assigned to each individual session token.
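The suspicion-based policy can be sketched with a simple priority queue: requests are ordered by suspicion level, nothing is dropped outright, and high-suspicion requests are still served when capacity allows. The class and request names are illustrative; the real smartproxy would also enforce session checks and capacity limits:

```python
import heapq
import itertools

class SuspicionScheduler:
    """Minimal sketch of suspicion-based scheduling: lower suspicion is
    served first, ties are broken in FIFO order, nothing is discarded."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # FIFO tie-break within a level

    def submit(self, request, suspicion):
        heapq.heappush(self._queue, (suspicion, next(self._order), request))

    def next_request(self):
        """Pop the least suspicious pending request, or None if idle."""
        if not self._queue:
            return None
        _, _, request = heapq.heappop(self._queue)
        return request

sched = SuspicionScheduler()
sched.submit("GET /heavy-search (bot?)", suspicion=9)
sched.submit("GET /cart (trusted)", suspicion=1)
sched.submit("GET /home (unknown)", suspicion=5)
```

Draining the queue serves the trusted request first and the suspected bot last, which is exactly the forwarding order the design principle above calls for.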
Scheduling Mechanism. When a request arrives without a session token, or with an expired one, the source IP is searched in the reputational database in order to discover whether there is existing information about its suspicion level. To avoid direct access to the DB, and in order to increase computational efficiency, this checking process consists of several steps:

1. An efficient local check for the presence of information about a specific IP. If there is no IP match, that IP will be scheduled with high priority; otherwise we proceed to the second stage.

2. The SC (Suspicion Checking) component checks for the presence of the IP in its local cache, which relates to the most recent visitors, i.e. from the last hours (a configurable value). Only in case a mismatch (or an expiration) is detected in the local cache is the reputational distributed DB queried.

3. Finally, on the basis of the information received from the SC, the request is scheduled with the right priority in the scheduling queue. All the other requests are simply verified regarding their validity (tampering, expiration). If those requests are valid, the SmartProxy should already have the associated suspicion level in its cache.

Figure 4.2: Smart Proxy

The redirect, VirtualHost based, is made to all the web servers that we want to protect, and to the ADL (Actions Deep Logger), a host that has the task of collecting all the information coming from the clients relating to the users' actions on the application.

4.2.3 SC: Suspicion Checking

The suspicion checking is the interface component between the reputational collaborative database and the local database of suspicion levels that has to be
stored inside the smartproxy cache. Its main task is to provide the suspicion level of the clients that reach the smartproxy.

Figure 4.3: Scheduling mechanism

The SC can work either in push or in pull mode. In pull mode, it waits for a check request from the smartproxy. As soon as it receives such a request, it checks for the presence of the client ID in its local cache, which basically relates to the last visitors of the webapplication, i.e. from the last hours or another configurable value. Only in case of a mismatch in the local cache, or if the trust value associated with the client has expired, does the SC query the reputational distributed database. It is obviously necessary to limit this kind of query as much as possible, since it is a high-cost operation.

In push mode, instead, it keeps track of the incoming requests through the attached monitor agent, and it continuously stores in its local cache the clients' trust information held in the collaborative database. For each new client, or for each client with an expired trust value, it pushes the new updated value to the local cache of the smartproxy.
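A pull-mode sketch of this tiered lookup is shown below; the TTL is illustrative and `remote_lookup` stands in for the expensive query to the collaborative database:

```python
import time

class SuspicionChecking:
    """Pull-mode sketch of the SC: consult the local cache of recent
    visitors first, and query the (expensive) distributed reputational
    DB only on a miss or an expired entry."""

    DEFAULT = 0.0  # unknown clients start with no recorded suspicion

    def __init__(self, remote_lookup, ttl_seconds=3600.0):
        self.remote_lookup = remote_lookup
        self.ttl = ttl_seconds
        self.cache = {}          # client_id -> (suspicion, stored_at)
        self.remote_queries = 0  # counted here only to show the savings

    def suspicion_of(self, client_id, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(client_id)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]                      # fresh cache entry
        self.remote_queries += 1               # costly distributed query
        value = self.remote_lookup(client_id)
        if value is None:
            value = self.DEFAULT
        self.cache[client_id] = (value, now)
        return value

reputation = {"badguy": 8.5}                   # stand-in for the TCDB
sc = SuspicionChecking(reputation.get, ttl_seconds=3600.0)
first = sc.suspicion_of("badguy", now=0.0)     # miss -> remote query
second = sc.suspicion_of("badguy", now=10.0)   # served from local cache
stale = sc.suspicion_of("badguy", now=7200.0)  # expired -> remote again
```

The repeated lookup within the TTL never touches the distributed database, which is the cost-saving behavior described above.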
4.2.4 ADL: Actions Deep Logger

Figure 4.4: Actions Deep Logger

Its task is to collect and catalog in a separate DB any information coming from the clients, i.e. the actions taken by users interacting with the WebApp. The main logged actions are the keystrokes pressed during the filling of a web form, cursor movement tracking, and all the information concerning the user agent and the client's environment, i.e. the information typically collected by common WebAnalytics applications. It also keeps track of requests made to the links/resources injected into the pages as crawler traps: hidden links necessary to distinguish crawlers or the actions generated by a Randomized Legitimate Recorded Session.
4.2.5 Monitoring

Figure 4.5: Monitoring

The critical components of the model (Smart Proxy, WAS, WA, DB) include agents (MA) for monitoring QoS and, more generally, the overload state of their sub-components (CPU, disk, memory, load). The agents are involved in the regular monitoring of the components to which they are attached, and in reporting the results of the analysis to the Monitoring Frame.
The alerts mostly concern the QoS thresholds; only the agent that monitors the Smart Proxy has some extended features, like monitoring the number of requests coming from suspected clients with respect to the total. The control center has the task of continuously analyzing the reports generated by the agents and, in case a (customizable) alert threshold is exceeded, of taking the necessary countermeasures. One of the countermeasures that can be taken automatically when the thresholds are exceeded is to place filters as close as possible to the sources of the suspicious traffic (typically the border routers of a company).
4.2.6 TCDB: Trust Collaborative Database

Figure 4.6: Trust Collaborative Database

The TCDB is the shared database in which all the information related to the attacks detected by the partners of the collaborative infrastructure is stored. In particular, a trust value is assigned to all the clients that have manifested some suspicious actions. The trust level of these clients decreases as the number of attacks recorded from the same client increases.

The clients that during an attack appear to have made the same kind of requests, or the same sequences of requests, to the web server are tagged as part of the same army/botnet. A particular trust value is assigned to each single botnet, and it is influenced by the detected attacks coming from that botnet.
In addition to the information coming from the analysis of direct attacks on the infrastructure partners, the database is also enriched with information derived from third-party projects, as described in paragraph 2.3.

Since suspicious clients remain stored in the database over time, it is necessary to find a correct policy for updating their trust values. Similarly, WOMBAT attributes different source IDs to the same IP address when the recorded attacks are separated by intervals of more than 25 hours. Here, too, we need to find a threshold for the redemption of the clients stored in the database of suspects. It will be difficult to fix an acceptable threshold without first having tested our solution in the real world, in a way that is accessible and available to real criminals.

However, we keep in mind that a value quite a bit larger than the 25 hours set in WOMBAT is necessary. In fact, compared to WOMBAT, which monitors especially the proliferation of malware, DDoS attacks are typically launched on the command of a botmaster, and these launches are usually not automatic.

Furthermore, since the countermeasures taken against malicious sources are based mainly on decreasing the priority level of their requests, and not on dropping them, we think it is not inappropriate to adopt a more restrictive policy against excessive redemption of the clients. In the worst case, clients that have in reality been sanitized, but whose trust level has not yet been updated in the database, may experience a degradation of the quality of service in the use of the web application, but hardly a denial of it.
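One possible update policy can be sketched as follows; the penalty factor and the redemption window are illustrative placeholders, chosen only to sit well above WOMBAT's 25 hours as argued above:

```python
REDEMPTION_HOURS = 25 * 4  # assumed: well above WOMBAT's 25-hour window
PENALTY = 0.2              # assumed per-attack trust reduction factor

def updated_trust(trust, attacks_recorded, hours_since_last_attack):
    """Sketch of a TCDB update policy: trust shrinks with each recorded
    attack, and is fully restored (the client is 'redeemed') only after a
    long quiet period. All constants are illustrative, not from the thesis."""
    if hours_since_last_attack >= REDEMPTION_HOURS:
        return 1.0  # redeemed: restore full trust
    return max(0.0, trust * (1.0 - PENALTY) ** attacks_recorded)

t1 = updated_trust(1.0, attacks_recorded=1, hours_since_last_attack=1)
t3 = updated_trust(1.0, attacks_recorded=3, hours_since_last_attack=1)
back = updated_trust(0.2, attacks_recorded=3, hours_since_last_attack=120)
```

Since low trust only demotes a client's priority rather than dropping its requests, an over-strict decay is tolerable: a sanitized but not yet redeemed client is slowed down, not denied.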
4.2.7 Auditing Room

Figure 4.7: Auditing components

By auditing room we mean the environment for the analysis of the data collected about the activities of the clients. The auditing process can be off-line or automatic. The first case is when the auditor wishes to perform data analysis on the activities undertaken by the clients who were previously connected to the system, possibly after the detection of a DDoS attack or other anomalies. The automatic type, instead, consists in detecting certain abuses of the system resources: unequivocal signs of attempted attacks.
Automatic Checking. There exist some specific client actions that can trigger warning signals, which may lead to the immediate submission to the database of a record about who carried them out. For submitting the record to the reputational database we could distinguish between clients with some criminal record, for which this operation can be automatic, and clients without a record, for which this might require human intervention.

One action that can be automatically intercepted is related to all the requests for the traps injected in the webpages. The distinction between a crawler and a Randomized Legitimate Recorded Session is made by the presence or absence of JavaScript: indeed, a crawler typically does not have JavaScript enabled, while a Randomized Legitimate Recorded Session can. In both cases we are able to tag the IP, or the set of source IPs, with the appropriate suspicion class (medium/low for crawlers, high for the others).

Furthermore, it is possible to detect malicious behavior like tampered or reused session tokens, and a high number of different simultaneous sessions from the same user or client. The set of automatically detected suspicious actions can be extended in relation to the application logic of the target web application.

Off-line. The auditing process can be based on a simple analysis of the information gathered by the ADL about suspicious activities. For those activities for which the auditor does not consider the information collected by the ADL sufficient to declare illegality or not, we propose to correlate this information with the information about the trust of the clients stored in the distributed database.

The most significant suspicions which one should consider for a more effective auditing operation, in particular for actions caused by a Legitimate Recorded Session (Random included), are:

1. Mouse Maps Overlapping: the overlay of mouse-tracking maps;

2.
Sessions cloning: the repetition of the same pattern of requests/actions/visit times.

Such actions may come from the same client or from different clients. Mouse Maps Overlapping means the occurrence of a perfect match/overlap of the cursor-movement tracking maps produced by a client for every web page he visits. We consider the perfect overlap of these maps highly suspicious if they were generated within a short time by the same client, or even by different clients.
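A simplified sketch of the map-overlap test follows; real traces would need resampling and alignment first, and the tolerance and threshold values are illustrative:

```python
def overlap_ratio(map_a, map_b, tolerance=3):
    """Fraction of sampled cursor points in map_a whose corresponding
    point in map_b (same sample index) lies within `tolerance` pixels."""
    if not map_a or len(map_a) != len(map_b):
        return 0.0
    close = sum(
        1
        for (xa, ya), (xb, yb) in zip(map_a, map_b)
        if abs(xa - xb) <= tolerance and abs(ya - yb) <= tolerance
    )
    return close / len(map_a)

def is_cloned_session(map_a, map_b, threshold=0.95):
    """Flag two mouse maps as a suspicious near-perfect overlap."""
    return overlap_ratio(map_a, map_b) >= threshold

trace = [(10, 10), (40, 12), (90, 30), (120, 80)]
replayed = [(x + 1, y) for x, y in trace]           # same map, 1px jitter
human = [(5, 200), (50, 180), (300, 40), (20, 90)]  # unrelated movement
```

A replayed trace with only a pixel of jitter overlaps almost perfectly and is flagged, while a genuinely different human trace is not.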
Another suspicious scenario occurs when the same sequence of pages is requested within a short time by the same client or by different clients, with very similar visit times per page.

4.2.8 Target Server

We consider as the Target Application Server all the basic components necessary for running a Web application. These components are mainly a Web Server, an Application Server and a Database. These components might be merged on a single server, or each can be replicated on several standalone servers.
Chapter 5

Software Implementation

In this chapter we first present the software we used for our experiments; in the second section we give an overview of the implementation model; in the third we describe the implementation of each single component; and finally, in the last section, we describe the testbed used for the experiments.

5.1 Required Software

In the following section we present all the software used in the tests, first covering the software that has a key role in the realization of the proposed solution, and then the software that was eventually chosen even if not essential.

5.1.1 SeleniumHQ

Selenium [77] is a portable software testing framework for web applications. Selenium provides a record/playback tool for authoring tests without learning a test scripting language. Selenium provides a test domain-specific language (DSL) to write tests in a number of popular programming languages, including C#, Java, Ruby, Groovy, Python, PHP, and Perl. Test playback is possible in most modern web browsers. Selenium deploys on Windows, Linux, and Macintosh platforms. It is open source software, released under the Apache 2.0 license, and can be downloaded and used without charge. The latest side project is Selenium Grid, which provides a hub allowing multiple Selenium tests to run concurrently on any number of local or remote systems, thus minimizing test execution time.

Selenium Grid allows the Selenium-RC solution to scale for large test suites, or for test suites that must be run in multiple environments. With Selenium Grid, multiple instances of Selenium-RC run on various operating
system and browser configurations; each of these registers with a hub when launched. When tests are sent to the hub, they are redirected to an available Selenium-RC, which launches the browser and runs the test. This allows tests to run in parallel, with the entire test suite theoretically taking only as long to run as the longest individual test.

Selenium IDE

Selenium-IDE is the Integrated Development Environment for building Selenium test cases. It operates as a Firefox add-on and provides an easy-to-use interface for developing and running individual test cases or entire test suites. Selenium-IDE has a recording feature, which keeps account of user actions as they are performed and stores them as a reusable script to play back. It also has a context menu (right-click) integrated with the Firefox browser, which allows the user to pick from a list of assertions and verifications for the selected location. Selenium-IDE also offers full editing of test cases for more precision and control.

Although Selenium-IDE is a Firefox-only add-on, tests created in it can also be run against other browsers by using Selenium-RC and specifying the name of the test suite on the command line.

Selenium Remote Control

Selenium-RC allows the test automation developer to use a programming language for maximum flexibility and extensibility in developing test logic. If the application under test returns a result set, and if the automated test program needs to run tests on each element in the result set, the programming language's iteration support can be used to iterate through the result set, calling Selenium commands to run tests on each item.

It comes basically in two parts:

1. A server which automatically launches and kills browsers, and acts as an HTTP proxy for web requests from them.

2. Client libraries for your favorite computer language.
The RC server also bundles Selenium Core and automatically loads it into the browser. Selenium Remote Control is great for testing complex AJAX-based web user interfaces under a Continuous Integration system. It is also an ideal solution for users of Selenium Core or Selenium IDE who want to write tests in a more expressive programming language than the Selenese HTML table format customarily used with Selenium Core.
Selenium-RC provides an API (Application Programming Interface) and a library for each of its supported languages: HTML, Java, C#, Perl, PHP, Python, and Ruby. This ability to use Selenium-RC with a high-level programming language to develop test cases also allows the automated testing to be integrated with a project's automated build environment.

5.1.2 TC: Traffic Control

Traffic control is the term given to the entire packet queuing subsystem in a network or network device. Traffic control consists of several distinct operations. Classifying is a mechanism by which packets are identified and placed in individual flows or classes. Policing is a mechanism by which one limits the number of packets or bytes in a stream matching a particular classification. Scheduling is the decision-making process by which packets are ordered and re-ordered for transmission. Shaping is the process by which packets are delayed and transmitted to produce an even and predictable flow rate.

These many characteristics of a traffic control system can be combined in complex ways to reserve bandwidth for a particular flow (or application) or to limit the amount of bandwidth available to a particular flow or application.

One of the key concepts of traffic control is the concept of tokens. A policing or shaping implementation needs to calculate the number of bytes or packets which have passed at what rate. Each packet or byte (depending on the implementation) corresponds to a token, and the policing or shaping implementation will only transmit or pass the packet if it has a token available. A common metaphorical container in which an implementation keeps its tokens is the bucket. In short, a bucket represents both the number of tokens which can be used instantaneously (the size of the bucket) and the rate at which the tokens are replenished (how fast the bucket gets refilled).
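The bucket metaphor translates almost directly into code; the rate, bucket size, and per-packet cost below are illustrative:

```python
class TokenBucket:
    """Minimal token-bucket policer as described above: up to `size` tokens
    can be consumed instantaneously, refilled at `rate` tokens per second."""

    def __init__(self, rate, size):
        self.rate = float(rate)    # tokens added per second
        self.size = float(size)    # bucket capacity (maximum burst)
        self.tokens = float(size)  # the bucket starts full
        self.timestamp = 0.0

    def allow(self, now, cost=1.0):
        """Return True if a packet of `cost` tokens may pass at time `now`."""
        elapsed = now - self.timestamp
        self.timestamp = now
        self.tokens = min(self.size, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=2.0, size=4.0)           # 2 tokens/s, burst of 4
burst = [bucket.allow(now=0.0) for _ in range(5)]  # only 4 pass at once
later = bucket.allow(now=1.0)                      # refilled by t = 1 s
```

The instantaneous burst is bounded by the bucket size, while the sustained rate is bounded by the refill rate, which is exactly the pair of limits a policer or shaper needs.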
Under Linux, traffic control has historically been a complex endeavor. The tc command-line tool provides an interface to the kernel structures which perform the shaping, scheduling, policing and classifying.

HTB. Hierarchical Token Bucket is a classful qdisc written by Martin Devera with a simpler set of configuration parameters than CBQ [84]. There is a great deal of documentation about HTB and its uses on the author's site and also on Stef Coene's website. Below is a very brief sketch of the HTB system. Conceptually, HTB is an arbitrary number of token buckets arranged in a hierarchy (yes, you probably could have figured that out without my
sentence). Let's consider the simplest scenario. The primary egress queuing discipline on any device is known as the root qdisc. The root qdisc will contain one class (complex scenarios could have multiple classes attached to the root qdisc). This single HTB class will be set with two parameters, a rate and a ceil. These values should be the same for the top-level class, and will represent the total available bandwidth on the link.

In HTB, rate means the guaranteed bandwidth available for a given class, and ceil is short for ceiling, which indicates the maximum bandwidth that class is allowed to consume. Any bandwidth used between rate and ceil is borrowed from a parent class, hence the suggestion that rate and ceil be the same in the top-level class.

A number of child classes can be made under this class, each of which can be allocated some amount of the available bandwidth from the parent class. In these child classes, the rate and ceil parameter values need not be the same as suggested for the parent class. This allows you to reserve a specified amount of bandwidth to a particular class. It also allows HTB to calculate the ratio of distribution of available bandwidth according to the ratios of the classes themselves. This should be more apparent in the examples below.

Hierarchical Token Bucket implements a classful queuing mechanism for the Linux traffic control system, and provides rate and ceil to allow the user to control the absolute bandwidth allocated to particular classes of traffic, as well as to indicate the ratio of distribution of bandwidth when extra bandwidth becomes available (up to ceil).

Keep in mind, when choosing the bandwidth for your top-level class, that traffic shaping only helps if you are the bottleneck between your LAN and the Internet. Typically, this is the case in home and office network environments, where an entire LAN is serviced by a DSL or T1 connection.
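Assuming an interface name and bandwidth figures chosen only for illustration, a small helper can assemble the tc commands for such a rate/ceil hierarchy (the sketch merely builds the command strings; it does not execute them):

```python
def htb_commands(dev, link_kbit, children):
    """Build the tc commands for a simple HTB hierarchy: one top-level
    class with rate == ceil == link bandwidth, plus one child class per
    (classid, rate, ceil) triple."""
    cmds = [
        f"tc qdisc add dev {dev} root handle 1: htb default 30",
        f"tc class add dev {dev} parent 1: classid 1:1 "
        f"rate {link_kbit}kbit ceil {link_kbit}kbit",
    ]
    for classid, rate, ceil in children:
        cmds.append(
            f"tc class add dev {dev} parent 1:1 classid {classid} "
            f"rate {rate}kbit ceil {ceil}kbit"
        )
    return cmds

# 1000 kbit link: trusted traffic guaranteed 800 kbit and allowed to borrow
# up to the full link; suspicious traffic guaranteed 100 and capped at 200.
cmds = htb_commands("eth0", 1000, [("1:10", 800, 1000), ("1:20", 100, 200)])
```

This is the shape of border-router policy used in our response phase: the suspicious class is shaped to a narrow channel via its ceil, while the trusted class keeps its guaranteed rate.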
In practice, this means that you should probably set the bandwidth for your top-level class to your available bandwidth minus a fraction of that bandwidth.

TCNG

Traffic Control Next Generation (TCNG) [85] is a project by Werner Almesberger to provide a powerful, abstract, and uniform language in which to describe traffic control structures. The tcng project provides a much friendlier interface to the human by layering a language on top of the powerful, but arcane, tc command-line tool. Traffic control configurations written in tcng are easily maintainable, less arcane and, importantly, also more portable. The tcc parser in the tcng distribution transforms the tcng language into a number of output formats. By default, tcc reads a file (specified as an argument or as STDIN) and prints to STDOUT the series of tc commands required to create the desired traffic control structure in the kernel.

5.1.3 SMT2: Simple Mouse Tracking

SMT2 [67] is a simple mouse tracking system to follow and record computer mouse activity on any Web page. This tool is focused on three topics:

1. Evaluating a website design. Every website has objectives to achieve, so its design should support the intended goals.

2. Analyzing and investigating mouse behavior trends. Certain mouse behaviors are common across many users and are useful in many ways (e.g. increasing the effectiveness of an interface layout).

3. Serving as a multi-purpose toolkit, since the field of applications can be expanded: for example, performing data mining tasks over a defined set of users, recruiting visitors for a usability test, classifying the audience, or evaluating pointing performance and/or motor abilities.

This project is a fully functional Open Source alternative to proprietary mouse tracking systems. You can study the code, adapt it, or even make derivative software, but in any case you must give proper attribution to the author.

5.1.4 OWA: Open Web Analytics

Open Web Analytics (OWA) [73] is open source web analytics software that you can use to track and analyze how people use your web sites and applications. OWA is licensed under the GPL and provides web site owners and developers with easy ways to add web analytics to their sites using simple JavaScript, PHP, or REST-based APIs. OWA also comes with built-in support for tracking web sites made with popular content management frameworks such as WordPress and MediaWiki.
5.1.5 Dosmetric

Dosmetric is a tool developed by Mirkovic et al. [16], which analyzes traffic traces collected through tcpdump, checking the successful transactions, the failed ones, and their respective completion times. From this analysis the tool primarily calculates metrics on the observed Quality of Service and Denial of Service.

The DoS measure covers only failed transactions, not all transactions, and the QoS measure covers only successful transactions. For example, a DoS value of 6 means that, for the transactions that failed, the quality was 6 times worse (in some dimension) than the lowest quality of service acceptable to humans. Since we are working with HTTP traffic, that dimension is the request/reply delay, which must be below 3 seconds. Multiple QoS studies have shown that 3 seconds is the delay after which humans perceive poor service [16].

We did not report the measured DoS in the results, since the prioritization policy of our prototype influences the ratio of successful to failed transactions; the measured values would therefore not give a fair reflection of the improvements with the smartproxy on.

5.1.6 Other used software

Apache

The Apache HTTP Server [70] is web server software notable for playing a key role in the initial growth of the World Wide Web. The majority of web servers using Apache run a Unix-like operating system. Apache is developed and maintained by an open community of developers under the auspices of the Apache Software Foundation. The application is available for a wide variety of operating systems, including Unix, GNU, FreeBSD, Linux, Solaris, Novell NetWare, Mac OS X, Microsoft Windows, OS/2, TPF, and eComStation. Released under the Apache License, Apache is characterized as open-source software. Since April 1996 Apache has been the most popular HTTP server software in use. As of November 2010, Apache served over 59.36% of all websites and over 66.56% of the million busiest [71].
Apache is primarily used to serve both static content and dynamic Web pages on the World Wide Web. Many web applications are designed expecting the environment and features that Apache provides.
Firefox

Mozilla Firefox [80] is a free and open source web browser descended from the Mozilla Application Suite and managed by the Mozilla Corporation. As of October 2010, Firefox is the second most widely used browser, with 30% of worldwide usage share. Firefox runs on various operating systems including Microsoft Windows, GNU/Linux, Mac OS X, FreeBSD, and many other platforms. Its current stable release is version 3.6.13, released on December 9, 2010. Firefox's source code is tri-licensed under the GNU GPL, GNU LGPL, and Mozilla Public License.

MySQL

MySQL [68] is one of the most widely used open source RDBMSs in the world, with millions of installations; it is one of the four building blocks of the well-known LAMP platform (Linux, Apache, MySQL, PHP), which can be found at the base of many web services. A DBMS (Data Base Management System) has the capacity to handle large volumes of data in a multi-user environment. From the physical point of view, that is, from the operating system's perspective, even a database consists of files. However, a DBMS allows concurrent processing and maintains the consistency of information while minimizing redundancy; it provides support for testing and prototyping, and permits software independence from the physical and logical organization of the data structures.

A DBMS is built on top of the operating system and widens the range of access structures provided by the file system. In summary, in a database, information management is logically centralized in an integrated and non-redundant representation. From the user's point of view, a database is seen as a collection of different kinds of data that model a certain portion of the reality of interest. Using such a system allows better management of information, guaranteeing great accessibility, which means less waiting time for the retrieval of the desired information.
It is therefore clear how important it is to be able to rely on a robust tool suitable for storing the data collected from the other elements of the architecture.

Joomla

Joomla [72] is a free and open source content management system (CMS) for publishing content on the World Wide Web and intranets. Joomla is written in PHP, stores data in a MySQL database, and includes features such as page caching, RSS feeds, printable versions of pages, news flashes, blogs, polls, search, and support for language internationalization.
Xvfb

In the X Window System, Xvfb (X virtual framebuffer) is an X11 server that performs all graphical operations in memory, without showing any screen output. From the point of view of the client, it acts exactly like any other server, serving requests and sending events and errors as appropriate; however, no output is shown. This virtual server does not even require the computer it is running on to have a screen or any input device: only a network layer is necessary. Xvfb is primarily [76] used for:

1. testing: since it shares code with the real X server, it can be used to test the parts of the code that are not related to the specific hardware;

2. testing clients in conditions that would otherwise require a range of different hardware; for example, checking whether clients work correctly at color depths or screen sizes that are rarely supported by hardware;

3. background running of clients (the xwd program, or a similar screenshot-capturing program, can be used to actually see the result);

4. running programs that require an X server to be active even when they do not use it (e.g., Clover HTML reports).

Ubuntu OS

Ubuntu [78] is a computer operating system based on the Debian GNU/Linux distribution and distributed as free and open source software. It is named after the Southern African philosophy of Ubuntu ("humanity towards others"). With an estimated global usage of more than 12 million users, Ubuntu is designed primarily for desktop use, although netbook and server editions exist as well. Web statistics suggest that Ubuntu's share of Linux desktop usage is about 50%, and indicate an upward usage trend as a web server.

Ubuntu is composed of many software packages, the vast majority of which are distributed under a free software license, with an exception made only for some proprietary hardware drivers [79].
5.2 Implementation Architecture Overview

The main objective of the tests is to verify that the request prioritization mechanism is effective in the presence of a DDoS attack and does not introduce excessive overhead. We expect the tests to show that the auditing technology described in Chapter 4 can be a useful tool to discriminate between legitimate clients and attackers once a DDoS has been received.

Figure 5.1: Implemented testing architecture

Figure 5.1 shows the simplified model used for the implementation
and the efficiency testing of the proposed solution. As can be seen from the picture, the target architecture has been reduced to a single instance, thereby eliminating the need for a load balancer. The smartproxy, as in the initial model, forwards client requests to the web server and the actions to be logged to the ADL.

The information about client trust is imported here directly into the smartproxy, thus eliminating the need for the suspicion-checking component that was used to manage the interactions between the smartproxy and the collaborative database.

The monitoring part has been reduced to the components considered most critical: the CPU and memory consumption of the smartproxy and of the web server, the QoS registered by a client that acts as an external sensor, and the QoS recorded at the web server level. These metrics are calculated in real time and stored in a log file for a more detailed analysis at the end of each experiment.

The audit room was reduced to a single off-line auditing component, thus excluding from the tests the mechanisms for automatic detection of clients' suspicious activities described in paragraph 4.2.7.

The functionality of the border router, of the smartproxy, and of the ADL stays the same as in the model introduced in Chapter 4; their implementation, however, is described in more detail in the following paragraphs.

5.3 Implementation Components Description

5.3.1 Internet

In the picture of the implemented model, the cloud labeled Internet includes all the clients that can potentially interact with the server and the hosted web applications. Among these are legitimate clients, but also compromised clients belonging to botnets. For simplicity we have separated the botnets into two groups: one represents the botnets already known, and the other those still unknown. Depending on whether a botnet is known or unknown, its clients will be called kbot or ukbot respectively.
By known botnets we mean all those botnets whose clients have existing information (in the collaborative database) regarding their trust level.

As in the initial model proposed in the previous chapter, we have also maintained the external sensor, to simulate and monitor the QoS experienced by a legitimate client.
Kbot and Ukbot clients

The clients controlled by botmasters are the ones that launch the attack against the server. Each client consists of a single physical computer running Ubuntu 10.04, Selenium RC 1.0.3, OpenJRE, Xvfb, Firefox 3.6.13, and some custom scripts for receiving commands from the botmaster, as described in paragraph 5.4.4.

Legitimate client / Sensor

The legitimate clients interact with the web server; these clients have the same basic software as the bots. The implemented clients act directly as external sensors, and for this purpose they include a custom script for monitoring latency and QoS.

Latency check script. This is a custom script that monitors the service time of the web application by regularly sending requests for dynamic web pages and recording the completion times. The requests are sent via wget with caching disabled, and every 5 seconds they alternate a computationally light request for a dynamic page with a heavier one. The heavier request consists of a search query for the disjoint occurrence of two common words spread across the site, with the results sorted by popularity: for every matching page, the server is asked to order the results according to the number of visits recorded in real time for each page.

The choice of such an onerous request is motivated by the fact that a simple request for a dynamic page does not necessarily need to access the database (it may be served from a cache) and would thus give only a partial picture of the response time. Considering the presented attack model, it is better to have as complete a view as possible.
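The latency check just described could be sketched as follows. This is not the thesis script (which is in Appendix A.1.3): the URLs, the search query, and the loop structure are illustrative placeholders, and the loop is only defined, not started.

```shell
#!/bin/sh
# Sketch of a latency-check loop: alternate a light and a heavy request
# every 5 seconds and record the completion time of each. The URLs and the
# query string are hypothetical; the real script is in Appendix A.1.3.
LIGHT_URL="http://webserver/index.php"
HEAVY_URL="http://webserver/index.php?searchword=word1+word2&ordering=popular"

# measure CMD...: run a command and print its elapsed time in milliseconds
measure() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Not invoked here: the monitoring loop itself
latency_check() {
    while true; do
        echo "light $(measure wget --no-cache -q -O /dev/null "$LIGHT_URL") ms"
        sleep 5
        echo "heavy $(measure wget --no-cache -q -O /dev/null "$HEAVY_URL") ms"
        sleep 5
    done
}
```

The `--no-cache` flag makes wget send a no-cache directive, matching the cache-disabling behavior described above.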
Meanwhile, we must monitor the response times of all the tiers of the target system, both the web server and the database.

5.3.2 Border Router

This is a machine running the Ubuntu operating system, which natively integrates software for dropping/shaping packets at the kernel level, such as tc and iptables. It is the entry point for traffic coming from the Internet and directed to the web server. It was included in the test mainly to support the pushback technique described in paragraph 4.2.1, in case the smartproxy becomes a bottleneck of the system and gets overloaded during an attack.
5.3.3 SmartProxy

The smartproxy is the core component of the mitigation mechanism chosen in our proposed solution. In order to implement it we configured a PC with the Ubuntu operating system, installing all the software needed to operate MasterShaper [74] (Apache [70], MySQL [68], PHP, jpgraph [65], phplayersmenu [66]). MasterShaper was chosen initially because it appeared to provide the appropriate tools for shaping, built on software such as tc, HTB, and iptables. It also provides a convenient web configuration interface with the ability to monitor inbound traffic, distinguishing between the various configured pipes.

Unfortunately, once in the large-scale testing environment in DETERlab, MasterShaper proved unreliable with regard to the incoming traffic percentage graphs: the incoming traffic often appeared to be almost nonexistent, even in the midst of an attack. Its limited shaping reconfigurability, compared with the potential of tc/iptables, led us to abandon this software.

In the final tests, shaping and scheduling of incoming requests were performed with tc [84] applied in the outgoing direction (on the interface connected directly to the web server). In fact, shaping acts on outgoing traffic and not on incoming traffic, which cannot be restricted through shaping techniques.

To speed up and simplify the subsequent optimization of the configuration during the tests, we relied on the libraries included in TCNG [85], which gave us the possibility to write the configuration in a C-like language. See Appendix A.3.1 for an example of a configuration written in that language. This program was then compiled by tcc (the parser included in tcng), which by default produces the tc commands that configure the kernel according to the program specifications. The output corresponding to the code in Appendix A.3.1 is shown in A.3.2.
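For readers unfamiliar with HTB, the following is an illustrative sketch of the kind of tc commands that tcc generates from a tcng program; the actual configuration used in the tests is the one in Appendices A.3.1/A.3.2, and the device name, rates, class ids, and firewall marks below are hypothetical.

```shell
# HTB qdisc on the interface towards the web server; unclassified
# traffic falls into class 1:30
tc qdisc add dev eth1 root handle 1: htb default 30

# Root class capping the total bandwidth towards the web server
tc class add dev eth1 parent 1: classid 1:1 htb rate 100mbit

# High-priority class for trusted clients, low-priority class for known bots
tc class add dev eth1 parent 1:1 classid 1:10 htb rate 80mbit ceil 100mbit prio 0
tc class add dev eth1 parent 1:1 classid 1:30 htb rate 1mbit  ceil 10mbit  prio 7

# Firewall marks (set elsewhere, e.g. by iptables according to the trust
# database) select the destination class
tc filter add dev eth1 parent 1: protocol ip handle 1 fw flowid 1:10
tc filter add dev eth1 parent 1: protocol ip handle 3 fw flowid 1:30
```

This mirrors the design described above: shaping is applied on the outgoing interface, and prioritization is obtained by mapping clients to HTB classes with different rates and priorities.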
To facilitate the test phase, simple scripts were subsequently developed to clean the currently set shaping rules and to load the rules required for each experiment prior to its launch.

Finally, the Sysstat monitoring suite and tcpdump were installed to track the state of the smartproxy during and after every experiment. In particular, we paid attention to CPU consumption, memory consumption, and inbound/outbound traffic, considered the parts potentially exposed to the largest load.
5.3.4 WS

On the WS PC we combined the web server, web application, and database layers. We therefore initially set up a LAMP server: Linux (Ubuntu 10.04), Apache 2, MySQL, PHP 5. We then installed Joomla [72] on the same machine as the web application, together with some extensions needed to extend the logging functionality: the modules J4Age [82] and Mod-HTML [88].

J4Age was already introduced in the previous chapter. Mod-HTML was helpful to automatically inject into each page of the web application the JavaScript required by OWA [73] and SMT2 [67] for client tracking.

In order to keep the server state monitored, the Sysstat suite was installed, together with the tcpdump tool for post-mortem analysis of the traffic with the clients at the end of each experiment.

5.3.5 ADL

The Advanced Deep Logger has been implemented on a LAMP PC: Linux (Ubuntu 10.04), Apache 2, MySQL, PHP 5. All this software is required by OWA [73] and SMT2 [67].

Before the choice fell on SMT2, a test period was dedicated to the software UsaProxy [89], an academic tool created to collect information on the most common actions of a client interacting with a web application, in order to improve the usability of the application itself. These actions include tracking of the mouse pointer, page scrolling, filling in forms in a specific order, and the time lag between one action and another: all the information necessary to explore the most common actions and to improve the interface and usability of the site.

In our scenario, on the other hand, the obtained information proved very useful for discovering the repeated actions of the clients that are sources of attack. Unfortunately, the development of this software stopped after the first associated publication in 2006 [23]. The state of development proved too incomplete and unstable, so we had to look for a replacement, and the choice fell on SMT2.
The biggest limitation of SMT2 in comparison with UsaProxy is that the former requires changes, even if small, to the application in which it is installed. UsaProxy, instead, allowed us to add a software layer between the clients and the web application, so that no modification was required: UsaProxy changes the source of the application's dynamic pages in real time, injecting the JavaScript necessary for tracking client actions.

In abandoning UsaProxy for SMT2, we were also forced to reconsider
the need to implement a legacy mode in the tests as well, to support old applications without modifying their code as SMT2 requires. Nevertheless, we believe that the technique introduced by UsaProxy is very useful for implementing legacy support, especially in future developments of our proposed solution.

During the first local tests of SMT2 we relied on the latest release available at that time, v2.0.1. That version, however, gave us many problems and showed many growing pains: often the web application navigation sessions of a client were not intercepted, and thus no useful information was recorded. Thanks to the collaboration with Luis Leiva, the author of the application, it was then possible to fix the bugs in the SVN development repository; the fixes were carried into the next stable release of the application (v2.0.2).

5.3.6 Static Trust DataBase

Since the tests involve no interaction with the collaborative database, which is the main focus of this study, we inserted the trust information about the clients in the Internet cloud directly into the memory of the smartproxy, according to two simple categories:

1. known bots;

2. unknown bots / legitimate clients.

Instead of using the finer discrimination described in the presented model, we settled on these two categories, which are sufficient to achieve the targets fixed for the tests.

5.3.7 Monitoring

The monitoring of the status of the critical components of the architecture takes place on different fronts: internally, on the WS and on the smartproxy, and externally as well, thanks to a client that acts as an outside sensor.

CPU and memory consumption are monitored on the WS and on the smartproxy through SAR/sysstat, and network traffic through tcpdump. The QoS experienced by an external client interacting with the web application is monitored directly on the sensor.
The monitoring is done through Dosmetric (described in paragraph 5.1.5), which is run against the server through a script specifically designed to store the completion times of more or less onerous requests. These metrics are very important for having a universal and impartial tool that can be used to compare the various DDoS mitigation solutions on the basis of a common yardstick.
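The SAR/sysstat figures collected on the WS and on the smartproxy are analyzed after each run. A minimal sketch of such post-processing is shown below; the log layout assumed here (time, %user, %system, %idle columns) is a simplification of real sar output, and the function name is our own.

```shell
# Sketch: average CPU utilisation (100 - %idle) over a sar-style CPU log.
# Assumed (hypothetical) column layout: time %user %system %idle
avg_cpu() {
    awk 'NF == 4 { busy += 100 - $4; n++ }
         END { if (n) printf "%.1f\n", busy / n }' "$1"
}
```

A summary like this makes it easy to compare the load on the smartproxy and on the web server across experiments with and without the prioritization enabled.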
For completeness, we introduced an additional tool to calculate the QoS of an external client, with the help of an ad hoc script that measures the latency of the server's responses to the client/sensor requests. The script is shown in Appendix A.1.3.

The feedback on the outcome of the transactions between clients and server, and thus on the load of the web server, is obtained by examining the Selenium test results. Considering the results of the recorded sessions (created with Selenium IDE) run by the Selenium server, we can expect that a high number of transactions completed by a single client is indicative of a low load level on the web server. Moreover, the number of successful Selenium sessions compared with the failed ones is a useful signal for understanding the effectiveness of the implemented request prioritization system.

5.3.8 Audit Room

By audit room for the tests we mean the virtual auditing environment offered by the union of all the basic logging facilities of the software we use, and in particular of those we introduced in the ADL. The auditing methods used in the tests are mostly manual; in contrast with the previous chapter, no automatic checking techniques have been implemented, since they are not the main purpose of this study.

5.4 Testbed description

In order to carry out tests that were as accurate as possible, we chose the road of emulation rather than that of simulation. Considering that the attack model taken into account in our study is highly interactive with the target server, emulation was almost obligatory: many studies show the importance and greater relevance of the results of emulation tests compared to simulation.

To obtain an environment that can be scaled up significantly in order to create botnets of a certain size, we initially thought of running the experiment on the cloud computing platform offered by Amazon, Amazon EC2.
This platform is also fully supported by Selenium Grid for the deployment and control of large-scale tests. However, the fact that Amazon EC2 is not free of charge led us to discard this possibility. Our choice therefore fell on DETERlab, an academic project specifically born to provide a medium-scale platform for scientific experiments in the
computer security field.

5.4.1 DETERlab: cyber-DEfense Technology Experimental Research laboratory Testbed

The DETERlab testbed is a general-purpose experimental infrastructure that supports research and development on next-generation cyber security technologies. The testbed allows repeatable medium-scale Internet emulation experiments for a broad range of network security projects, including experiments with malicious code.

The DETERlab testbed uses the Emulab cluster testbed software developed by the University of Utah. This software controls a pool of PC experimental nodes that can be assigned, interconnected with high-speed links in nearly arbitrary topologies, loaded, and monitored remotely, to meet the requirements of each experiment. Experimenters use the DETERlab web interface to define, load, control, and monitor their experiments remotely.

DETERlab is composed of two linked clusters, one at USC ISI and the other at UC Berkeley, with a total of about 400 experimental nodes. Funding for DETERlab has been provided by the US Department of Homeland Security (DHS) and the National Science Foundation (NSF). DETERlab supports a collaborative community of academic, government, and industrial researchers, allowing them to safely run reproducible experiments on system and network attacks and countermeasures. The underlying purpose of DETERlab is to advance the science and art of computer security. In addition to building, operating, and maintaining the experimental infrastructure, the DETERlab project also performs R&D on security testbed design.

5.4.2 Hardware Cluster Details

The nodes in DETERlab inherit the hardware characteristics of the cluster to which they belong. There are several types of clusters in DETERlab; we describe at least those on which the nodes associated with our experiment were instantiated, narrowing the field to the pc2133 and pc3000 classes given below.
The node-to-component associations are all listed in the table in image A.1.

pc2133 class machines

The pc2133 and bpc2133 machines have the following features:
• Dell PowerEdge 860 chassis
• One Intel Xeon X3210 quad-core processor running at 2.13 GHz
• 4 GB of RAM
• One 250 GB SATA disk drive
• One dual-port PCI-X Intel Gigabit Ethernet card for the control network (only one port is used)
• One quad-port PCIe Intel Gigabit Ethernet card for the experimental network

CPU flags: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm

pc3000 class machines

The pc3000 and bpc3000 machines have the following features:

• Dell PowerEdge 1850 chassis
• Dual 3 GHz Intel Xeon processors
• One 36 GB 15k RPM SCSI drive (bpc machines may be configured with two)
• 4 Intel Gigabit experimental network ports
• 1 Intel Gigabit control network port

The pc3060 and bpc3060 machines are the same as the pc3000/bpc3000 machines except that they have one more experimental network interface. The pc3100 machines have a total of 9 experimental interfaces and 1 control network interface; there are only 4 machines of this type.
5.4.3 SEER

The Security Experimentation EnviRonment (SEER) is a set of tools and agents that helps an experimenter set up, script, and perform experiments in the DETER environment. It includes agents for traffic generation, attack generation, and traffic collection and analysis. SEER provides:

• an extensible Java GUI interface;
• a module system for adding your own agents, collectors, aggregates, or services;
• a module/software dependency setup with building and caching of third-party software.

The picture shown below is a screenshot of the GUI. In particular, it shows the topology of the experiment with the medium-size botnet, including the state of every single node involved. The green nodes are those for which some network traffic was detected and reported to the control node that handles the communications with the SEER GUI.

Figure 5.2: SEER GUI screenshot
5.4.4 Custom Scripts

The botnets are controlled through bash scripts and SSH connections. We developed a Bash library called batchlib.cfg that permits sending commands to the bots during the experiments; this library is included in Appendix A.1.1.

Each bot also includes a daemon that, once started, listens for the commands given by the botmaster in order to launch sessions of web application usage. These sessions are based on parameters communicated by the botmaster to the nodes before or during the experiment.

Launch Selenium Browsing Session

When an experiment is run, each bot receives a command to start the daemon that manages the launch of browsing sessions on the web application. The daemonized script code is shown in the file in Appendix A.1.4.

The nodes of the known and unknown botnets have separate scripts to read the specific configuration parameters of the botnet to which they belong. This differentiation guarantees greater flexibility in testing, and it allows us to get closer to a realistic scenario: in reality, distinct botnets may be controlled by different botmasters and act independently, or they may coordinate with each other in order to cooperate in the same attack.

The script reads the specified parameters from two configuration files distributed to all bots via NFS. One of these files includes global parameters specific to the current experiment run: the execution time (typically fixed at 10 minutes), the delay between one launched session and the following one, and the label of the running test. The configuration file specific to each botnet indicates, with the childs parameter, the number of concurrent sessions launched by each single bot.

Once the configuration parameters have been loaded, one selenium-server instance per child is launched. Every server instance acts as a bridge between Firefox and the actions specified in the session file.
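The configuration-loading step just described can be sketched as follows. Apart from global.cfg and the childs parameter, which appear in the thesis scripts, the variable names (RUNTIME, LABEL) and the file layout are hypothetical.

```shell
# Sketch of the bot-side configuration loading: global settings are shared
# by all bots via NFS, per-botnet settings select the number of concurrent
# selenium sessions. RUNTIME and LABEL are assumed variable names.
load_bot_config() {
    . "$1"    # global file (e.g. global.cfg): RUNTIME, LABEL, delays ...
    . "$2"    # botnet-specific file: childs ...
}

launch_sessions() {
    i=1
    while [ "$i" -le "$childs" ]; do
        # one selenium-server instance per child (stubbed here with echo)
        echo "starting selenium-server for child $i (test $LABEL, ${RUNTIME}s)"
        i=$((i + 1))
    done
}
```

Sourcing plain shell files as configuration keeps the bots trivially reconfigurable between runs, which matches the automation goals described in this section.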
An example of a session file used for the client is shown in Table 6.23, the graphical representation of the HTML file parsed by the selenium server. The results of every single Selenium session launched toward the web server are collected in files named with the label of the ongoing test, the timestamp, and the name of the node that generated the specific session.
Therefore, at the end of each experiment it is possible to trace the number of sessions run by each node and, among these, to distinguish between the successful and the unsuccessful ones. The ability to collect this information is very important for verifying the effectiveness of the proposed infrastructure.

5.4.5 Local test porting to DETERlab

Porting the tests to the DETERlab environment was not trivial at all. We started by deploying small experiments to familiarize ourselves with the offered environment. We then ported the WS and ADL components, and this process was unexpectedly complicated. The main problem was that DETERlab, being a secure environment, has no direct access to the Internet. To access any node of the experiment, you must first connect via SSH to an internal gateway machine, and only then can you connect from this machine to the desired node.

This meant that any software that had to be configured via a web interface (mainly Apache, SMT2, and OWA) could only be configured via port forwarding: port 80 of the nodes to be configured via the web interface was remapped onto a local port. While this was not a problem for Apache, there were problems with the less stable and less popular software, SMT2 and OWA: such an unusual configuration turned out not to be contemplated by this software. Incorrect handling of cookies by OWA and SMT2 did not allow this exception to be handled, and it was not possible to complete the installation of this software on the ADL.

Lesson learned. When you work in such a custom and outsourced environment as DETERlab, even the most trivial issue, which would be easy to resolve on a local system, may not be solved as easily as it may seem. The variables involved in a large-scale system like DETER are almost incalculable. The most critical factor is that we cannot physically access our servers in any way.
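As an illustration of the double-hop access and the port remapping described above (a sketch only: the user name, gateway host, and node names are hypothetical, not the actual DETERlab account details):

```shell
# Shell access: first the gateway, then the internal experiment node
ssh davide@users.deterlab.net
ssh adl            # run on the gateway to reach the internal node

# Port forwarding: remap port 80 of the ADL node onto local port 8080
# through the gateway, so that http://localhost:8080/ reaches the node's
# web interface from the local machine
ssh -L 8080:adl:80 davide@users.deterlab.net
```

It is exactly this tunneled access pattern that the web installers of SMT2 and OWA failed to handle, as discussed above.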
In general, when you run an effort in an outsourced environment, it becomes even more difficult to answer the classic system administrator's question: "Did I make a mistake, or does the encountered issue depend on an infrastructure bug?". The collaboration with DETERlab's system administrators was therefore essential, and it was facilitated by an internal ticketing system. On the other hand, the time zone difference with the United States slowed this process considerably, since the working hours were virtually complementary. A decisive factor was also working closely with the developers of SMT2 and OWA, which allowed us to find and then fix the encountered bugs.
Chapter 6

Test Evaluation

In this chapter we present our experiments in detail. In paragraph 6.1 we describe the procedures adopted to conduct the experiments, and in paragraph 6.2 we give a full report of all the test results. Finally, in paragraph 6.3 we present a brief summary of the configurations that proved to offer the best performance.

6.1 Test Description

Before we were able to gather information from our experiments, a long period of debugging took place. It was dedicated mostly to solving problems inherent to DETERlab or related to conflicts between the software we chose and the testing environment itself.

In order to use the resources offered by DETERlab without the risk of abuse, we created scripts to automate the experiment procedures as much as possible. Some of these scripts are described in paragraph 5.4.4; others are part of the library batchlib.cfg included in Appendix A.1.1.

In this section we first describe all the steps needed to run a single experiment, and then the main parameters that can be varied to characterize the individual experiments.

6.1.1 Single Run Experiment

By a single experiment we mean a test whose configuration parameters remain constant until the end of the fixed experiment duration. Each such experiment lasted 10 minutes. This amount of time allows each bot to launch a sufficient number of requests to take down the target server. At the same time, this duration permits running tests with different configurations
and without having to wait needlessly for hours. The chosen time value was also used in similar studies, including those found in Mirkovic's work [16, 18, 17]; choosing the same value facilitates the comparison between the evaluation results of different solutions.

The global parameters are modified before the launch of each test, in particular its label and the activation flag of the smartproxy. After the configuration, the actual experiment is launched using the routine run_exp inside the library batchlib.cfg.

Algorithm 6.1 run_exp routine

function run_exp {
  1) source $sw/bot/global.cfg
  2) rr client1 $sw/ $rname $ethcl
  3) pidcl=$!
  4) botd start
  # backbone fw-sp monitoring
  5) rr sp $sw/ $rname $ethsp_ext
  6) rr sp $sw/ $rname $ethsp_ext _ext
  7) rr sp $sw/ $rname $ethsp_lan _lan
  8) rr sp $sw/ $ethsp_ext /users/davide/log/$ns-sp_ext-pre--$
  9) rr sp $sw/ $ethsp_lan /users/davide/log/$ns-sp_lan-pre--$
  # WebServer monitoring of cpu/network
  10) rr ws $sw/ $rname $ethws
  11) wait $pidcl
  12) sleep 10
  13) kill_exp
}
run_exp. The routine run_exp above may help the reader understand the procedure used in the individual tests, and the order in which the various infrastructure components are invoked. The code translates into the following line-by-line sequence of operations:

1. Load the global parameters from the file global.cfg;
2. Run the Selenium sessions on the legitimate client and the latency monitoring on the sensor side;
3. Save the legitimate client's process id (pid);
4. Sequentially activate the botd daemon on each bot, starting from the first bot in the unknown botnet (ukbot1) up to the last bot in the known botnet (kbot30, kbot60, or kbot120);
5. Launch CPU and memory monitoring on the smartproxy;
6. Dump network traffic on the smartproxy's external interface;
7. Dump network traffic on the smartproxy's internal interface;
8. Save the request buffer state on the smartproxy's external interface;
9. Save the request buffer state on the smartproxy's internal interface;
10. Launch CPU and memory load monitoring on the web server;
11. Wait for the end of the Selenium sessions on the client; afterwards the botd daemon is shut down on each bot, following the same bot order used for launching it;
12. Wait 10 seconds;
13. Call the kill_exp routine, which ends all the monitoring services.

The logs of the monitored services are stored directly in the user space accessible via NFS and shared by all the nodes in the experiment. In this way the logs are preserved even after the experiment swap-out, which forcibly occurs after 4 hours of idle time.
6.1.2 Parameters Tuning

To see which are the most important parameters that influence the performance of the tested system, we invested much time in the definition of a selenium session that could stress the server without being a simple flooding of requests. Thus we set a timeout of 120 seconds for each single action of the selenium session. We found this value a good solution to the tradeoff between maximizing the number of sessions completed under load (without failures) and maintaining high interaction with the web server.

Initially, different selenium sessions were used to verify how much load such requests can generate on the system. The difference between heavy and light requests is crucial, but to reduce the scope of testing we made a mixed session, which contains both heavy and light requests. The same selenium session was used in all experiments, on both the legitimate client and the bots. The only difference was the user that clients apply to log in to the system. This differentiation simplified the auditing process.

The daemon on every bot was also able to launch several concurrent sessions, using the parameter childs. The concurrent sessions allowed us to preserve DETERlab's resources in the initial test phase. For the final tests, instead, we gave priority to scaling up the number of physical PCs connected to the system, in order to obtain more accurate data, rather than simulating an increase in botnet size through the launch of more concurrent sessions by each bot. The size of the botnets used in the experiments is shown in Table 6.1:

Size     # known bots   # unknown bots
small         30              15
medium        60              30
large        120              30

Table 6.1: Number of physical PCs associated to each botnet

The parameter to which we paid more attention during the final tests was the size of the buffer associated with each class in the smartproxy, and its maximum associated bandwidth. For the classes in the smartproxy we refer to the classic nomenclature used for HTB, as described in 5.1.2.
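The childs fan-out described above can be sketched as a small wrapper. The names below (CHILDS, launch_sessions, the echoed message) are illustrative assumptions, not the thesis' actual botd code, which would start the real selenium session script in the background:

```shell
#!/bin/sh
# Hypothetical sketch of the bot daemon's "childs" fan-out:
# launch CHILDS concurrent selenium sessions from a single bot.
CHILDS=${CHILDS:-3}

launch_sessions() {
    i=1
    while [ "$i" -le "$CHILDS" ]; do
        # The real daemon would start the session script here,
        # e.g. "client-session &"; we only record the launch.
        echo "launching concurrent session $i of $CHILDS"
        i=$((i + 1))
    done
    wait   # wait for any background sessions to finish
}

launch_sessions
```

With childs=3 a single physical bot emulates three clients, which is how DETERlab resources were preserved in the initial test phase.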
More specifically, each class in the smartproxy has an associated priority, as reported in Table 6.2.
Priority Value   Priority Label
      1               Top
      2               High
      3               Fallback
      4               Medium
      5               Low

Table 6.2: HTB Priority Classes

High priority is given in the tests to the legitimate client, and in some scenarios even to the clients of the unknown botnet. The clients from the known botnet, instead, are always treated with low priority.

6.2 Test Results

After checking the parameters that most influence the results of the experiments and the performance of the tested solution, we narrowed their field by acting on:

• Smartproxy on / off.

In case of smartproxy on, we can act on it, making the following variations in:

• Class queue - Buffer size
• Class queue - Maximum reserved bandwidth

Then we repeated the experiments on different botnet combinations according to Table 6.1.

In the first test only the legitimate client interacted with the web server, without any bots. This situation allowed us to define the parameters of the test system in steady state, with and without active smartproxy.

After the experiment with only the client we introduced the botnets as outlined in the schema of Table 6.1. For each botnet configuration some experiments were performed, first by associating low priority to the clients of the Known botnet, and then by checking the downgrading of the Unknown botnet clients from legitimate (high priority) to suspicious (medium priority).
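The five HTB classes of Table 6.2 can be expressed as a tc command hierarchy. The sketch below only prints the commands rather than executing them; class ids, the parent rate, and all rates except the 32Kbps cap (the value actually used in the tests for the penalized class) are assumptions, not the thesis' exact setup:

```shell
#!/bin/sh
# Generate a tc/HTB hierarchy matching the priority classes of Table 6.2.
# Handles and rates are illustrative; on a real smartproxy the output
# would be piped to sh to be applied.
build_htb() {
    dev=$1
    echo "tc qdisc add dev $dev root handle 1: htb default 13"
    echo "tc class add dev $dev parent 1: classid 1:1 htb rate 100mbit"
    echo "tc class add dev $dev parent 1:1 classid 1:11 htb rate 10mbit prio 1"    # top
    echo "tc class add dev $dev parent 1:1 classid 1:12 htb rate 5mbit prio 2"     # high
    echo "tc class add dev $dev parent 1:1 classid 1:13 htb rate 1mbit prio 3"     # fallback
    echo "tc class add dev $dev parent 1:1 classid 1:14 htb rate 120kbit prio 4"   # medium
    echo "tc class add dev $dev parent 1:1 classid 1:15 htb rate 32kbit ceil 32kbit prio 5"  # low
}

build_htb eth0     # print the commands; "build_htb eth0 | sh" would apply them
```

The ceil on the lowest class is what caps the known botnet's aggregate bandwidth without dropping its packets outright.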
6.2.1 Legitimate Client only

The first test refers to the scenario in which only the legitimate client is active, launching the selenium sessions and monitoring the status of the system. The client detects the average system latency, i.e. the latency without the interference of the traffic and server overload that connected bots can generate. The results of the first test with the smartproxy off are shown in Table 6.3.

client-type   passed   % passed   latency   %CPU WS   %CPU SP    QoS
legitimate       7        100      0.241      2.65      0.68    0.9586

Table 6.3: [test] Client only, sp=off

Subsequently we activated the smartproxy to detect the introduced overhead; the measured metrics are given in Table 6.4. Comparing the two tables we can notice that the activation of the smartproxy doesn't introduce visible overhead to the system. The parameters detected with the active smartproxy are very close to those reported with the smartproxy off.

We should also take into consideration that the smartproxy CPU remains stable in both experiments at a very low value; this shows that the work done by the smartproxy is not too onerous in terms of required computational resources. The CPU parameter is the one that we considered most critical for the smartproxy, and therefore we reported it in the table. Nevertheless, we also tracked the memory and the load on the disk, but considering that they were essentially idle, we didn't report their measured values.

client-type   passed   % passed   latency   %CPU WS   %CPU SP    QoS
legitimate       7        100      0.244      2.66      0.68    0.9584

Table 6.4: [test] Client only, sp=on: buffer=10, band=64Kbps

6.2.2 Small botnet

Respecting the compositions indicated in Table 6.1, we introduced the bots, which interact with the web server by creating legitimate traffic. Therefore, we have 30 bots in the Known botnet and 15 in the Unknown botnet. Each bot runs the same selenium session as the client, thus the same sequence of requests. Only the requests from the Known botnet undergo the intervention of the smartproxy.
Initially we assume that no a priori negative trust information is available for the unknown botnet's client components. These clients will therefore be treated as legitimate.
The first test allows verifying the status of the system with the smartproxy off. Therefore, we obtain the values of the system under attack, which we then have to improve through the activation of the smartproxy during the attack. The measured values appear in Table 6.5.

client-type       passed   failed   total   % passed
legitimate           3        2        5      60
UnKnown botnet      72       18       90      80
Known botnet       128       40      168      76.19

Table 6.5: [test] sp=off

By activating the smartproxy, we had to set the parameters that influence the priorities of the Known botnet's service requests. Thanks to a preliminary test phase in which we verified how the system responds to changes of the various parameters, we found the most satisfactory responses by varying the buffer size and the bandwidth devoted to the penalized class. In the tests presented below it is possible to see how, by varying the buffer queue of the low-priority class and the maximum bandwidth granted to the queue, it is possible to mitigate the influence of the bots belonging to the Known botnet without directly dropping their requests.

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet      91       11      102      89.216
Known botnet        76       54      130      58.462

Table 6.6: [test] sp=on: buffer=10, band=64Kbps

In Table 6.6 it is possible to see the significant improvements that occur with the activation of the smartproxy. Comparing these results with those of Table 6.5, we notice an increase in the number of completed sessions, and an improvement in the percentage of correctly completed sessions for the legitimate client and for the unknown botnet's clients, which are in fact treated as legitimate in these tests. The increase in the number of completed sessions is already an indication that the prioritization of requests has worked. In addition, the higher percentage of successfully completed sessions is a further indicator of the effectiveness of the system.
client-type       passed   failed   total   % passed
legitimate           6        1        7      85.714
UnKnown botnet      88       13      101      87.129
Known botnet        67       61      128      52.344

Table 6.7: [test] sp=on: buffer=1, band=64Kbps

Further tests were carried out by tuning the buffer size and the bandwidth devoted to the low-priority class; the improvements of these experiments can be seen in Tables 6.7 and 6.8.

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet     104        1      105      99
Known botnet        13       78       91      14.286

Table 6.8: [test] sp=on: buffer=1, band=32Kbps

With this setup we found the best configuration for this type of attack, giving high priority to the legitimate client and to the clients of the unknown botnet, and instead minimizing the priority of the known botnet's bots. The setup used in Table 6.8 is realistic if the trust level of the filtered botnet is among the lowest in the range adopted in the shared database. Otherwise, a configuration like the one shown in Table 6.6 is already acceptable, offering a good compromise between the priority given to legitimate and unknown clients and a good continuity of service even for clients belonging to the known botnet.

Certainly an indicative improvement parameter introduced by the activation of the smartproxy is the average latency measured by the external sensor. It is possible to compare the latencies measured by the legitimate client — which in our experiment also acted as external sensor — in Table 6.23.

Refining priority after the first attack

In the proposed solution we gave importance to the choice of adopting a policy of mitigation and prioritization of requests, thanks to its fine filtering. We chose not to deal equally with all clients for whom we had information about previous suspicious actions, as if all were malicious. Therefore it was necessary to test a different type of prioritization for clients belonging to two distinct trust classes.
In this series of experiments we assumed the minimum trust for clients belonging to the Known botnet, and a medium trust for those belonging to the Unknown botnet. The downgrading of unknown-botnet clients from legitimate to (possibly) malicious can be due to internal auditing actions on our own system or on those of other partners. In fact, upon detecting suspicious actions from clients belonging to the unknown botnet, for which no other information about previous attacks is available in the database, we act on them with only a slight reduction of the priority of their requests.

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet     102        3      105      97.143
Known botnet        10       81       91      10.989

Table 6.9: [test] sp=on: ukbot-prio=medium, buffer=10, band=120Kbps; kbot-prio=low, buffer=1, band=32Kbps

In the first tests the smartproxy was configured to distinguish between two different classes of low-priority clients. The 15 clients of the UnKnown botnet are directed to a channel at medium priority (see Table 6.2), while the 30 clients of the Known botnet remain in the low-priority class.

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet     101        4      105      96.190
Known botnet        18       75       93      19.355

Table 6.10: [test] sp=on: ukbot-prio=medium, buffer=5, band=64Kbps; kbot-prio=low, buffer=1, band=32Kbps

Comparing the performance of the selenium sessions in Tables 6.9, 6.10 and 6.11, it is possible to notice how the service of the legitimate client is guaranteed, and at the same time the clients belonging to the unknown botnet have a good priority, demonstrated by the number of completed sessions and by the high percentage of sessions concluded successfully.
client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet      97        8      105      92.381
Known botnet        10       80       90      11.111

Table 6.11: [test] sp=on: ukbot-prio=medium, buffer=1, band=64Kbps; kbot-prio=low, buffer=1, band=32Kbps

From the session analysis we can't see any improvement of the service offered to the legitimate client; for this reason we refer the reader to the latency values reported in Table 6.23, from which it is possible to notice that the better configuration in this experiment is the one that guarantees the best response time to the legitimate client, while maintaining good service times even for the clients of the unknown botnet. Latency passed from 1.807 seconds (QoS 0.882054) with the smartproxy off to 0.477 seconds (QoS 0.942247) — definitely a great improvement in performance.

6.2.3 Medium botnet

After verifying the effectiveness of the tested solution with a limited number of PCs, we doubled the size of both botnets. The Known botnet grew from 30 to 60 PCs, and the Unknown one from 15 to 30. We then performed the tests trying to reflect the configurations adopted in the small-scale tests.

client-type       passed   failed   total   % passed
legitimate           4        0        4     100
UnKnown botnet     108        8      116      93.103
Known botnet       185       27      212      87.264

Table 6.12: [test] sp=off

In Table 6.12 we show the trend of the sessions in the absence of the smartproxy. We expect that by increasing the number of clients connected to the system there is a general degradation in performance, particularly in that experienced by the legitimate client. Comparing Table 6.12 with the analogous smaller-scale one (Table 6.5), one might think that there is an anomaly and that, instead of a decrease in performance, there is an improvement with the smartproxy off.
In order to understand these results better, it is necessary to take into account once more that the average latency was 1.807 seconds in the experiment with the small-sized botnet, and grows up to 10.464 seconds in the experiment with the medium-sized botnet.
client-type       passed   failed   total   % passed
legitimate           6        1        7      85.714
UnKnown botnet     137       43      180      76.111
Known botnet        23      144      167      13.772

Table 6.13: [test] sp=on: buffer=10, band=64Kbps

Activating the smartproxy with the default values, we noticed an improvement, not so much in the percentage of completely ended sessions as in the growing number of total successfully completed sessions of the legitimate client. The latency measured on the client passes from 10.464 seconds to 1.417 seconds, a further confirmation of the successful operation of the prioritization.

client-type       passed   failed   total   % passed
legitimate           5        1        6      83.333
UnKnown botnet     127       42      169      75.148
Known botnet        26      156      182      14.286

Table 6.14: [test] sp=on: buffer=1, band=64Kbps

By reducing the size of the buffer from 10 to a single stored packet, we didn't find any significant improvements of the mitigation system. Once again, analyzing the status of the selenium sessions it seems that there has been a degradation in performance, but going back to check the latency and QoS in Table 6.23 it is possible to see that, in spite of its small magnitude, an improvement was made between these two tests.

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet     151       30      181      83.425
Known botnet         0      116      116       0

Table 6.15: [test] sp=on: buffer=1, band=32Kbps

By reducing the maximum bandwidth assigned to the low-priority channel from 64Kbps to 32Kbps, we noticed a further improvement. The main effect is seen in the selenium sessions, both for the legitimate client and for those belonging to the unknown botnet, which have increased in number and in percentage of those successfully completed. The sessions coming from the known botnet have been more penalized and in fact, even if their
packets were not directly dropped, no selenium session was able to complete before encountering the server timeout.

Figure 6.1: [test] WS's CPU Load / Traffic between BorderRouter and WS

Web Server Load — It is interesting to note the change of the load on the server based on the activation state of the smartproxy. In figure 6.1 it is possible to see the CPU load of the web server during the 10-minute experiment with the smartproxy off. It is possible to notice that the CPU is almost always at the limit values, except for three bumps. By correlating the CPU load with the network load measured between smartproxy and webserver, it is possible to see that every bump in the CPU load is always preceded by a spike in the network traffic of the blue line on the graph.
The blue line indicates the outbound traffic from the web server towards the clients in the Internet. We can explain the inverted peaks in server CPU load as the completion of the received requests, which are satisfied in greater numbers after 20, 230 and 425 seconds from the beginning of the experiment. After satisfying the demands of the clients by sending them the required resources, the web server has a period of reduced work. During these time intervals the clients elaborate the responses received from the server in order to send new requests to it.

With the activation of the smartproxy in its optimal configuration, we can see the trend of the CPU load in figure 6.2.

Figure 6.2: [test] WS's CPU Load / Traffic between BorderRouter and WS

In this configuration it is clear that the overall load on the server's CPU
is much lower than in the absence of the smartproxy. In the CPU graph above it is possible to notice more idle valleys in comparison with the CPU graph of figure 6.1. Analyzing the graph of the network traffic, it is possible to notice web server traffic spikes coming just before the subsequent unloading of the CPU, strengthening the same argument given for the previous test.

Refining priority after the first attack

We tested the effectiveness of the adoption of a finer-scale prioritization, leaving the Known botnet at the low priority level as before, and associating the medium priority to the unknown-botnet clients, instead of treating them as legitimate.

Reflecting the format used for the other experiments, we report the relative results in Tables 6.16, 6.17 and 6.18.

client-type       passed   failed   total   % passed
legitimate           5        1        6      83.333
UnKnown botnet     151       28      179      84.358
Known botnet         1      159      160       0.625

Table 6.16: [test] sp=on: ukbot-prio=medium, buffer=10, band=120Kbps; kbot-prio=low, buffer=1, band=32Kbps

client-type       passed   failed   total   % passed
legitimate           5        1        6      83.333
UnKnown botnet     156       27      183      85.246
Known botnet         0      162      162       0

Table 6.17: [test] sp=on: ukbot-prio=medium, buffer=5, band=64Kbps; kbot-prio=low, buffer=1, band=32Kbps

client-type       passed   failed   total   % passed
legitimate           7        0        7     100
UnKnown botnet      84       43      127      66.142
Known botnet         1      166      167       0.599

Table 6.18: [test] sp=on: ukbot-prio=medium, buffer=1, band=64Kbps; kbot-prio=low, buffer=1, band=32Kbps

Even if in the first two configurations the improvement is quite poor, in the last experiment the results of the selenium sessions are
excellent. Comparing the latency and QoS values of Table 6.23, it is possible to get a broader view which confirms the data collected in Table 6.18.

Figure 6.3: [compared tests] WS's CPU load.

Web Server Load — Comparing the graphs of CPU load in the most efficient smartproxy configuration in the scenarios that precede and follow the first attack, there is a reduction of the CPU load in the second test, due to the filtering policies applied to the 30 unknown-botnet clients downgraded to medium priority.
The average CPU values calculated for the two experiments are respectively 64.47 and 38.99 percent. The average CPU values are much more meaningful than the simple graphs, which at a quick look do not show the real relief obtained in the second scenario. In fact, the average load between the first and the second scenario was halved.

6.2.4 Large botnet

Figure 6.4: Large-size botnet screenshot

The latest tests were carried out by doubling the Known botnet from 60 to 120 physical PCs. In figure 6.4, a screenshot extracted from SEER during the launch of an experiment shows how all the involved PCs cooperate to attack the target web server, thanks to the commands from the hub node. Here are the results of the selenium sessions.

client-type       passed   failed   total   % passed
legitimate           4        0        4     100
UnKnown botnet      55        5       60      91.667
Known botnet       187       49      282      79.237

Table 6.19: [test] sp=off

Table 6.19 shows the results with the smartproxy disabled, and
Table 6.20 those with the smartproxy on, configured with the most restrictive parameters according to the results of the previous tests.

client-type       passed   failed   total   % passed
legitimate           4        0        4     100
UnKnown botnet      73        3       76      96.053
Known botnet       228       54      282      80.851

Table 6.20: [test] sp=on: buffer=1, band=32Kbps

As already verified in the other experiments, the values of the selenium sessions do not give a fair reflection of the improvements with the smartproxy on. Reading the values shown in Table 6.23, it is possible to verify a slight increase of QoS and especially the reduction of the average latency experienced by the legitimate client, which passes from 19.262 to 12.428 seconds.

Refining priority after the first attack

Even in the extended configuration we tested the effectiveness of the adoption of different priority levels for suspected clients. Downgrading the unknown-botnet clients from legitimate to suspicious, we decreased their priority from high to medium.

Table 6.21 reports the results with the smartproxy off. These results should be very similar to those of the previous large-scale test, as the number of connected clients in both experiments is the same. This fact shows that our tests are reasonably consistent and repeatable.

client-type       passed   failed   total   % passed
legitimate           4        0        4     100
UnKnown botnet      78        8       86      90.698
Known botnet       199       73      272      73.162

Table 6.21: [test] sp=off

In Table 6.22 we show the results of the selenium sessions, which should be integrated with the latency values measured by the legitimate client. These values are indicated in Table 6.23, and pass from 17.833 to 12.704 seconds.
client-type       passed   failed   total   % passed
legitimate           4        0        4     100
UnKnown botnet      53        9       62      85.484
Known botnet       200       41      241      82.988

Table 6.22: [test] sp=on: ukbot-prio=medium, buffer=1, band=64Kbps; kbot-prio=low, buffer=1, band=32Kbps

Finally, we summarize in Table 6.23 the most significant parameters, such as the average latency and QoS experienced by the legitimate client, and the average server CPU consumption of each experiment described before.

botnet size   test   avg. latency [sec]   clients QoS   WS CPU [%]
small (1)               1.807              0.882054       73.56
                        0.643              0.930064       54.84
                        0.638              0.927713       55.46
                        0.484              0.934500       42.11
small (2)               0.536              0.935606       42.52
                        0.477              0.942247       41.80
                        0.497              0.938696       41.66
medium (1)             10.464              0.904182       86.34
                        1.417              0.895620       73.68
                        1.261              0.901333       71.26
                        1.142              0.914047       64.47
medium (2)              0.759              0.902178       60.01
                        0.854              0.900711       61.09
                        0.410              0.947501       38.99
large (1)              19.262              0.923643       94.86
                       12.428              0.934334       95.43
large (2)              17.833              0.927939       94.61
                       12.704              0.945134       95.26

Table 6.23: Test Summary: (1) first attack, (2) after first attack

The QoS parameters, calculated with Mirkovic's tool (dosmetric), and the CPU load are good signals of the system state and of the adopted countermeasures. However, we believe that the latency measured by the legitimate client is the most reliable and unambiguous indicator of the effectiveness of the smartproxy activation in all the experiments.
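The relative improvements visible in Table 6.23 can be computed directly from the latency column. A small awk helper (the name improvement is ours, not part of the thesis' scripts), applied to the medium- and large-botnet figures:

```shell
#!/bin/sh
# Relative reduction (percent) between the smartproxy-off and
# smartproxy-on values of a metric, as read from Table 6.23.
improvement() {
    awk -v off="$1" -v on="$2" 'BEGIN { printf "%.1f\n", (off - on) / off * 100 }'
}

improvement 10.464 0.410    # medium-botnet latency: prints 96.1
improvement 19.262 12.428   # large-botnet latency:  prints 35.5
```

The medium-botnet case shows the largest effect: activating the smartproxy cuts the legitimate client's latency by roughly 96%.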
6.2.5 Auditing Process

All the selenium sessions launched by the clients towards the server were the same (except the login username of the legitimate client). Therefore it was expected that they would leave a similar mark on the logging tools used in the ADL. Figures 6.5 and 6.6 show the extraction of a web page with the data collected by the ADL at the end of a session launched by a (bot) client.
Figure 6.5: Visitor details

In addition to the information collected by typical web analytics software, such as visit time, IP, environment and other information concerning the client, the pages visited by the clients were also collected for every single session. For each session, as shown in figure 6.6, the visit pattern was recorded,
page by page, with the respective time spent on each of them. The latter information is very useful to discriminate between a human and a bot client. Since all this displayed information is stored in the ADL database, if deemed necessary it is possible to search back through all the clients who have used the same navigation pattern, and thus it is possible to update the trust database with information coming from the last detected attack.

Figure 6.6: Visitor ClickStream
Finally, since the selenium sessions don't completely reproduce the cursor movement, no mouse maps associated with the sessions launched by the clients were recorded.

6.3 Test Summary

In order to help the reader better understand the test results reported in the previous paragraphs, we'd like to summarize below the most significant results obtained for each botnet size (as described in Table 6.1).

In each table, the first reported experiment expresses the values obtained with the smartproxy off, while the second one the values recorded with the best-performing smartproxy configuration obtained.

In the experiments with the small botnet, the best performance values were obtained in the test with 30 clients in the Known botnet and 15 clients in the Unknown botnet, with the smartproxy on with buffer = 1 and maximum bandwidth of 32Kbps. The data obtained in the test configuration with class derating and more effective filtering of the requests from the unknown botnet are virtually similar, but slightly worse. This is due to the lack of a size difference between these two botnets.

In Table 6.24 it is possible to see that there is a palpable improvement on all fronts. The sessions successfully completed by the legitimate client pass from 60 to 100%, the response time (latency) is lowered from 1.807 to 0.484 seconds, the web server CPU load drops from an average of 73.56% to 42.11%, and the QoS metrics measured with dosmetric show a clear improvement in performance.

test   sessions [%] (1)   avg. latency [sec]   WS CPU [%]   clients QoS
              60               1.807              73.56       0.882054
             100               0.484              42.11       0.934500

Table 6.24: Best test results with small botnet.
(1) Percentage of successfully completed sessions from the client.

The experiment with the medium-size botnet gave very good results, especially those after the first attack (Table 6.25).
In this configuration the smartproxy assigned medium priority to the Unknown botnet (buffer = 1, band = 64Kbps) and low priority to the Known botnet (buffer = 1, band = 32Kbps). The figure for the sessions completed by the legitimate client is not very significant unless it is correlated with the measured latency. In fact, in the experiment
0.3.3.0 without the smartproxy, the measured latency grew to more than 10 seconds. The observed high response time is explained by looking at the average CPU value of nearly 90%, which shows a remarkable situation of web server overload.

test   sessions [%] (1)   avg. latency [sec]   WS CPU [%]   clients QoS
             100              10.464              86.34       0.904182
             100               0.410              38.99       0.947501

Table 6.25: Best test results with medium botnet
(1) Percentage of successfully completed sessions from the client.

Finally, scaling the experiment up to the largest configuration, with 120 PCs in the Known botnet and 30 PCs in the Unknown botnet, we can still see some improvements in the response time and QoS experienced by the legitimate client. In particular, in Table 6.26 we can see that the measured latency without the smartproxy is almost 20 seconds, while with the smartproxy on it drops to just over 12 seconds. The QoS measured with dosmetric grows to a value comparable to the one seen with the smaller botnets.

test   sessions [%] (1)   avg. latency [sec]   WS CPU [%]   clients QoS
             100              19.262              94.86       0.923643
             100              12.704              95.26       0.945134

Table 6.26: Best test results with large botnet
(1) Percentage of successfully completed sessions from the client.

The results show the effectiveness of the proposed solution, which provides significant improvements in response time and QoS for the legitimate client, even when the server is under attack from a botnet of 150 computers, without ever explicitly dropping packets.
Chapter 7

Conclusion and Future Work

7.1 Conclusion

In this study we have proposed a new cross-cutting approach to the problem of DDoS attack mitigation at the application level.

The first contribution of this thesis is a comprehensive survey of the known DDoS defense strategies proposed in the scientific literature, as well as the description of recent research projects that advocate collaborative defense strategies. Thanks to this wide view of the cyber crime problem it was possible to define the model we proposed and described in Section 3.2. In this model we took very important steps in describing the first solution that integrates a fine request-prioritization system on the basis of shared trust. The proposed solution offers tools for the recognition of cloned sessions simultaneously replicated on multiple bots, and enables the development of tools for identifying the source of an attack (although the actual implementation remains an open issue).

The model we proposed encompasses several areas of research. For practical reasons (feasibility) we simplified the prototype, which has since been implemented. The simplified model is described in detail in Chapter 4. Each of its key components was implemented first locally and then on the cloud, as described in Chapter 5. The prototype tests were performed on a large scale in the DETERlab emulation environment. With this environment we were able to assess the scalability of our solution by testing it under the attack of botnets composed of up to 150 physical machines. The results, reported in Chapter 6, demonstrate the effectiveness of the proposed solution, which provides significant improvements in response time and QoS as perceived by legitimate clients even when the server is under the attack of a botnet of 150 computers.
Our extensive evaluation using DETERlab provided us with a number of valuable insights regarding the evaluation methodology.

In highly customizable outsourced environments such as DETERlab, even the most trivial problems that can be easily solved on a local system may become difficult to overcome. The most critical factor here is the fact that we cannot physically access the servers in any way. In general, the utilization of an outsourced evaluation environment makes it even more difficult to answer the classic question of a system administrator: "Did I make a mistake, or does the encountered issue result from an infrastructure bug?". However, we believe that features such as the scalability and repeatability of DETERlab experiments enable invaluable demonstrations of the correctness and effectiveness of the proposed solution, and they may serve as a basis for the development of other similar alternatives.

Finally, we have proposed a set of tools, in part demonstrating their effectiveness, to facilitate the retrieval of useful information relevant for the identification of the source of an attack. Thanks to these tools it is possible to keep updated the trust assigned to suspicious clients, which is held in the shared database. We believe that the information obtained from previous cyber attacks is very important to improve our ability to defend against future attacks. Sharing this information can be a cornerstone, as in this way the Internet could become more secure.

7.2 Future Work

Although our proposal entails significant progress in defending against existing DDoS attacks, some of its aspects have not been adequately detailed in this study due to limited resources. It would be interesting to develop the following points in the future:

• Automate the extraction of all the sources that have launched the same pattern of attack detected by the activation of the post-mortem auditing.
• Test and verify the detection of the malicious actions described in the initial model.
• Implement and test client identification mechanisms that are not only based on the IP address, as described in [43].
• Verify the possibility of introducing techniques to assign positive trust to verified legitimate clients.
• Integrate and verify the adoption of the Pushback [4] / HPB [40] techniques during an attack (discussed in paragraphs 3.2.2 and 4.2.1), together with the prioritization system, in order to keep under control the actions coming from the clients with the lowest trust level.
• Repeat the tests with selenium sessions under both heavy and light request loads, instead of the single mixed-session type used in this study.

Finally, it would be important to evaluate the proposed solution in a production network, thus overcoming the limitations that are inherent to the experimental testbed used in this work. Similar tests would also allow the integration of information coming from similar collaborative solutions, such as those described in paragraph 2.3. A step in this direction could be taken by integrating the approach proposed in this thesis with future developments of other collaborative defense projects, such as WOMBAT and CoMiFin.
Appendix A

Custom Scripts

A.1 Bots Control

A.1.1 batchlib.cfg

Algorithm A.1 batchlib.cfg

sw=/proj/ALDADAMS/sw/
source $sw/bot/global.cfg
#rname=$1

#Parameters to set at experiment swap-in
#WS
ethws=eth107
#SmartProxy INcoming
ethsp_ext=eth3
ethsp_lan=eth0
#internal client
ethcl=eth94
####################################################
unset known
known=kbot1
unset uknown
uknown=ukbot1
###################################################
## Set the number of bots in the known botnet
unset nknown
nknown=121    #num bots +1
###################################################
## Set the number of bots in the unknown botnet
unset nuknown
nuknown=31    #num bots +1
###################################################
unset i
for (( i = 1 ; i < nknown ; i++ ))
do
    known=( ${known[@]} kbot$(( i+1 )) )
done
##################################################
unset i
for (( i = 1 ; i < nuknown ; i++ ))
do
    uknown=( ${uknown[@]} ukbot$(( i+1 )) )
done

function rr {
    ssh $1.$exp "$2"
    echo "$1 $2 OK"
}

function knownexec {
    unset i
    for (( i=0; i<$nknown-1; i++ )); do
        rr ${known[${i}]} "$1"
    done
}

function unknownexec {
    #echo $1
    unset i
    for (( i=0; i<$nuknown-1; i++ )); do
        rr ${uknown[${i}]} "$1"
        # echo ${uknown[${i}]} OK
    done
}

function allbots {
    unknownexec "$1"
    knownexec "$1"
}

function botd_start {
    allbots "sudo /etc/init.d/botd start"
}

function botd {
    allbots "sudo /etc/init.d/botd $1"
}

function end_exp {
    source $sw/bot/global.cfg
    rr client1 "pkill client-session"
    rr client1 "sudo pkill latency"
    rr sp "sudo pkill wt"
    rr ws "sudo pkill wt"
    rr sp "sudo pkill tcpdump"
    rr sp "sudo mv /tmp/*.pcap /users/davide/log/"
    rr sp "sudo pkill sar"
    rr sp "sudo mv /tmp/*.cpu* /users/davide/log/"
    rr sp "$sw/ $ethsp_ext /users/davide/log/$ns-sp_ext-post--$"
    rr sp "$sw/ $ethsp_lan /users/davide/log/$ns-sp_lan-post--$"
    rr ws "sudo pkill sar"
    rr ws "sudo mv /tmp/*.cpu* /users/davide/log/"
    rr client1 "sudo pkill tcpdump"
    # rr client1 "pkill java"
    rr hub "pkill run_exp"
    #rr sp "sudo rm /tmp/*"
    #rr ws "sudo rm /tmp/*"
    # rr fw "sudo chown -R davide:ALDADAMS /users/davide/log/*"
}

function kill_client {
    rr client1 "pkill client-session"
    rr client1 "sudo pkill latency"
    rr client1 "sudo pkill tcpdump"
    sleep 1
    rr client1 "pkill java"
}

function kill_exp {
    if [ -z "$1" ]; then
        botd stop
        end_exp
    else
        botd purge
        end_exp
        kill_client
    fi
}

function run_exp {
    source $sw/bot/global.cfg
    rr client1 "$sw/ $rname $ethcl" &
    pidcl=$!
    botd start
    #Monitor the backbone reaching the sp (towards the fw)
    rr sp "$sw/ $rname $ethsp_ext" > /dev/null 2>&1 &
    rr sp "$sw/ $rname $ethsp_ext _ext" > /dev/null 2>&1 &
    rr sp "$sw/ $rname $ethsp_lan _lan" > /dev/null 2>&1 &
    #rr sp "$sw/ $ethsp"
    #rr sp "$sw/ $ethsp"
    rr sp "$sw/ $ethsp_ext /users/davide/log/$ns-sp_ext-pre--$"
    rr sp "$sw/ $ethsp_lan /users/davide/log/$ns-sp_lan-pre--$"
    # Monitor WebServer cpu/network
    rr ws "$sw/ $rname $ethws" > /dev/null 2>&1 &
    wait $pidcl
    sleep 10
    kill_exp
}

function clean_and_reboot {
    allbots "$sw/"
    allbots "sudo reboot"
}

function smartp {
    if [ "$1" = "on" ]
    then
        # activate the policies on the smartproxy
        #rr sp "$sw/"
        #sleep 1
        rr sp "$sw/"
    elif [ "$1" = "off" ]
    then
        # deactivate the rules on the smartproxy
        rr sp "$sw/"
    fi
}

function run_exp_bg {
    rr hub run_exp > /dev/null 2>&1 &
}
A.1.2 wt.sh

Algorithm A.2 Wire-tapping: network traffic dumping

#!/bin/bash
# $1 : experiment name
# $2 : eth if not default
source /proj/ALDADAMS/sw/bot/global.cfg
dir=/tmp
#get node's url
url=`uname -n`
#get hostname
node=(`echo $url | tr '.' '\n'`)
tstamp=`date +%F_%H-%M`
if [ -z "$2" ]
then
    eth=""
else
    eth="-i $2"
fi
sudo tcpdump $eth -nnvvvS -s 0 -U -w $dir/t4-$node-$tstamp-$rname$3.pcap
A.1.3 latency.sh

Algorithm A.3 latency.sh

#!/bin/bash
tstamp=`date +%F_%H-%M`
dir=/users/davide/log
sw=/proj/ALDADAMS/sw
source $sw/bot/global.cfg
wget_par_heavy=(wget -N --no-cache --delete-after --quiet
    "http://WS/joomla/index.php?searchword=joomla+example&ordering=popular&searchphrase=all&option=com_search")
wget_par_light=(wget -N --no-cache --delete-after --quiet
    "http://WS/joomla/index.php?option=com_content&view=article&id=22&Itemid=29")
#get node's url
url=`uname -n`
#get hostname
node=(`echo $url | tr '.' '\n'`)

function latency_H {
    /usr/bin/time --format=%E -a -o $dir/$ns-$node-$tstamp-$rname.latency_heavy "$@"
}

function latency_L {
    /usr/bin/time --format=%E -a -o $dir/$ns-$node-$tstamp-$rname.latency_light "$@"
}

interval=5
t_start=$(date +%s)
while [ 1 ]
do
    #Count running time
    unset t_now
    t_now=$(date +%s)
    unset t_running
    let t_running=($t_now-$t_start)/60
    #check experiment lifetime
    if [ $t_running -eq $exp_last ] ; then
        echo "Experiment time expired after $t_running minutes"
        exit 0
    else
        echo "Launching wget then wait..."
        latency_H "${wget_par_heavy[@]}"
        sleep $interval
        latency_L "${wget_par_light[@]}"
        sleep $interval
    fi
done
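The `%E` format of `/usr/bin/time` used above writes elapsed wall-clock times such as `0:01.20` (minutes:seconds). A small awk helper (a sketch; the `avg_latency` name and the log contents are illustrative, not part of the scripts above) can average these log entries in seconds:

```shell
#!/bin/bash
# avg_latency <logfile>: average the %E entries written by /usr/bin/time,
# printing the mean latency in seconds (hours fields are not handled here).
avg_latency() {
    awk -F: '{
        s = $NF + 0                    # seconds (last field)
        if (NF > 1) s += $(NF-1) * 60  # minutes, if present
        sum += s; n++
    } END { if (n) printf "%.2f\n", sum / n }' "$1"
}

# Example: a log containing "0:01.20" and "0:00.80" averages to 1.00
```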
A.1.4 botd-known.sh

Algorithm A.4 botd-known.sh

#!/bin/bash
#Name of the run
rname=$1
#display variable
export DISPLAY=:99.0
#commands file
commands=/proj/ALDADAMS/sw/bot/known.cfg
#global conf file
global=/proj/ALDADAMS/sw/bot/global.cfg
#get node's url
url=`uname -n`
#get hostname
hn=(`echo $url | tr '.' '\n'`)
run=`date +%F_%H-%M`
dir=/users/davide/ss/$run
sw=/proj/ALDADAMS/sw/
mkdir $dir

function traffic() {
    java -jar /opt/selenium-grid-1.0.8/vendor/selenium-server-1.0.3-standalone.jar \
        -port $2 \
        -firefoxProfileTemplate /users/davide/testing/ff/ \
        -htmlSuite "*firefox" http://WS/joomla/ \
            /users/davide/testing/TestSuite.html $dir/TestResults-$hn-$rname-$1.html
    #xwd -root -display :99.0 -out $dir/ff-$hn
    #convert $dir/ff-$hn $dir/ff-$hn.jpg
}

function runbot() {
    pids=""
    echo "sleeping for $delay secs..."
    sleep $delay
    while [ $childs -gt 0 ]
    do
        echo "running child: $childs"
        #get time
        tstart=`date +%H-%M-%S_%N`
        let port=4444+$childs      #set selenium server's port
        traffic $tstart $port &    #launch a selenium server child
        pids="$pids $!"
        echo $tstart $port
        let childs=childs-1        #dec childs counter
    done
    waitall $pids
}

debug() { echo "DEBUG: $*" >&2; }
function readconf() {
    source $global
    source $commands
}

waitall() { # PID...
    ## Wait for children to exit and indicate
    ## whether all exited with 0 status.
    local errors=0
    while :; do
        debug "Processes remaining: $*"
        for pid in "$@"; do
            shift
            if kill -0 "$pid" 2>/dev/null; then
                debug "$pid is still alive."
                set -- "$@" "$pid"
            elif wait "$pid"; then
                debug "$pid exited with zero exit status."
            else
                debug "$pid exited with non-zero exit status."
                ((++errors))
            fi
        done
        (($# > 0)) || break
        # TODO: how to interrupt this sleep when a child terminates?
        sleep ${WAITALL_DELAY:-1}
    done
    ((errors == 0))
}

t_start=$(date +%s)
while [ 1 ]
do
    if [ -f $commands ] ; then
        #Read conf file
        readconf
        #Count running time
        unset t_now
        t_now=$(date +%s)
        unset t_running
        let t_running=($t_now-$t_start)/60
        #Check experiment lifetime
        if [ $t_running -eq $exp_last ] ; then
            echo "Experiment time expired after $t_running minutes"
            exit 0
        else
            echo "Launching $childs childs: wait..."
            runbot
            echo "$childs exits fine"
        fi
    else
        sleep 0
        #echo "no exist"
    fi
done
A.1.5 Legitimate client session

Figure A.1: Legitimate Client Session - HTML preview
A.2 DETERlab

A.2.1 test-4.ns

Algorithm A.5 experiment.ns

set ns [new Simulator]
source tb_compat.tcl

#Control node for the SEER GUI and the BorderRouter (fw)
foreach node { control fw } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] Ubuntu1004-ITA-A
}

#SmartProxy
foreach node { sp } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] SP-D
}

#HUB node
foreach node { hub } {
    #Create new node
    set $node [$ns node]
    tb-set-node-os [set $node] HUB-D
    #Have SEER install itself and start up when the node is ready
    #tb-set-node-startcmd [set $node] sudo /proj/ALDADAMS/sw/runme.hudson
}

#Known botnet
foreach node { kbot1 kbot2 kbot3 kbot4 kbot5 kbot6 kbot7 kbot8 kbot9 kbot10
    kbot11 kbot12 kbot13 kbot14 kbot15 kbot16 kbot17 kbot18 kbot19 kbot20
    kbot21 kbot22 kbot23 kbot24 kbot25 kbot26 kbot27 kbot28 kbot29 kbot30
    kbot31 kbot32 kbot33 kbot34 kbot35 kbot36 kbot37 kbot38 kbot39 kbot40
    kbot41 kbot42 kbot43 kbot44 kbot45 kbot46 kbot47 kbot48 kbot49 kbot50
    kbot51 kbot52 kbot53 kbot54 kbot55 kbot56 kbot57 kbot58 kbot59 kbot60
    kbot61 kbot62 kbot63 kbot64 kbot65 kbot66 kbot67 kbot68 kbot69 kbot70
    kbot71 kbot72 kbot73 kbot74 kbot75 kbot76 kbot77 kbot78 kbot79 kbot80
    kbot81 kbot82 kbot83 kbot84 kbot85 kbot86 kbot87 kbot88 kbot89 kbot90
    kbot91 kbot92 kbot93 kbot94 kbot95 kbot96 kbot97 kbot98 kbot99 kbot100
    kbot101 kbot102 kbot103 kbot104 kbot105 kbot106 kbot107 kbot108 kbot109 kbot110
    kbot111 kbot112 kbot113 kbot114 kbot115 kbot116 kbot117 kbot118 kbot119 kbot120 }
{
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] kbot-D
}

#Unknown botnet
foreach node { ukbot1 ukbot2 ukbot3 ukbot4 ukbot5 ukbot6 ukbot7 ukbot8 ukbot9 ukbot10
    ukbot11 ukbot12 ukbot13 ukbot14 ukbot15 ukbot16 ukbot17 ukbot18 ukbot19 ukbot20
    ukbot21 ukbot22 ukbot23 ukbot24 ukbot25 ukbot26 ukbot27 ukbot28 ukbot29 ukbot30 } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] ukbot-D
}

#Legitimate client
foreach node { client1 } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] client-D
}

#WS
foreach node { WS } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] WS-D
}

## Nagios not included
#Create the topology nodes (custom OS)
foreach node { ADS } {
    #Create new node
    set $node [$ns node]
    #Define the OS image
    tb-set-node-os [set $node] ADS-D
}

# LANs
set knownbot [$ns make-lan $fw $hub kbot1 kbot2 kbot3 kbot4 kbot5 kbot6 kbot7 kbot8 kbot9 kbot10
    kbot11 kbot12 kbot13 kbot14 kbot15 kbot16 kbot17 kbot18 kbot19 kbot20
    kbot21 kbot22 kbot23 kbot24 kbot25 kbot26 kbot27 kbot28 kbot29 kbot30
    kbot31 kbot32 kbot33 kbot34 kbot35 kbot36 kbot37
    kbot38 kbot39 kbot40 kbot41 kbot42 kbot43 kbot44 kbot45 kbot46 kbot47
    kbot48 kbot49 kbot50 kbot51 kbot52 kbot53 kbot54 kbot55 kbot56 kbot57
    kbot58 kbot59 kbot60 kbot61 kbot62 kbot63 kbot64 kbot65 kbot66 kbot67
    kbot68 kbot69 kbot70 kbot71 kbot72 kbot73 kbot74 kbot75 kbot76 kbot77
    kbot78 kbot79 kbot80 kbot81 kbot82 kbot83 kbot84 kbot85 kbot86 kbot87
    kbot88 kbot89 kbot90 kbot91 kbot92 kbot93 kbot94 kbot95 kbot96 kbot97
    kbot98 kbot99 kbot100 kbot101 kbot102 kbot103 kbot104 kbot105 kbot106 kbot107
    kbot108 kbot109 kbot110 kbot111 kbot112 kbot113 kbot114 kbot115 kbot116 kbot117
    kbot118 kbot119 kbot120 100Mb 0ms]
set unknownbot [$ns make-lan $fw $hub ukbot1 ukbot2 ukbot3 ukbot4 ukbot5 ukbot6 ukbot7 ukbot8 ukbot9 ukbot10
    ukbot11 ukbot12 ukbot13 ukbot14 ukbot15 ukbot16 ukbot17 ukbot18 ukbot19 ukbot20
    ukbot21 ukbot22 ukbot23 ukbot24 ukbot25 ukbot26 ukbot27 ukbot28 ukbot29 ukbot30 100Mb 0ms]
set internet [$ns make-lan $fw $client1 100Mb 0ms]
set frontend [$ns make-lan $fw $sp 1000Mb 0ms]
set backend [$ns make-lan $ADS $sp $WS 100Mb 0ms]
#$backend trace monitor
#$internet trace monitor
#$selenium trace monitor
$ns rtproto Static
$ns run
A.2.2 Nodes' cluster deployment

Table A.1: Nodes deployment 1/3
Table A.2: Nodes deployment 2/3
Table A.3: Nodes deployment 3/3
A.3 SmartProxy

A.3.1 tc.c (before DDoS)

Algorithm A.6 tc.c before the first attack

#include
#include
#include

$client =;
$ws =;
$ukbot =;
$kbot =;

dev eth0 {
    egress {
        // classification
        class ($top) ;
        class ($high) if (ip_src == $client && ip_dst == $ws) || (ip_src:24 == $ukbot && ip_dst == $ws) ;
        class ($medium) ;
        class ($low) if (ip_src:24 == $kbot && ip_dst == $ws) ;
        class ($fallback) if 1;
        htb () {
            class (rate 100000 kbps) {
                // legitimate - unknown src traffic
                $top = class ( prio 1, rate 20 Mbps ) { fifo (limit 1000p); };
                // ukbot for charts
                $high = class ( prio 2, rate 20 Mbps ) { fifo (limit 1000p); };
                // fallback - the unfiltered traffic
                $fallback = class ( prio 3, rate 10 Mbps ) { fifo (limit 1000p); };
                // ukbot after first attack
                $medium = class ( prio 4, rate 64 kbps ) { fifo (limit 5p); };
                // well-known attackers - kbot
                $low = class ( prio 5, rate 32 kbps ) { fifo (limit 3p); };
            }
        }
    }
}
A.3.2 TC rules

# ================ Device eth0 =================
tc qdisc add dev eth0 handle 1:0 root dsmark indices 8 default_index 0
tc qdisc add dev eth0 handle 2:0 parent 1:0 htb
tc class add dev eth0 parent 2:0 classid 2:1 htb rate 12500000bps
tc class add dev eth0 parent 2:1 classid 2:2 htb rate 2500000bps prio 1
tc qdisc add dev eth0 handle 3:0 parent 2:2 pfifo limit 1000
tc class add dev eth0 parent 2:1 classid 2:3 htb rate 2500000bps prio 2
tc qdisc add dev eth0 handle 4:0 parent 2:3 pfifo limit 1000
tc class add dev eth0 parent 2:1 classid 2:4 htb rate 1250000bps prio 3
tc qdisc add dev eth0 handle 5:0 parent 2:4 pfifo limit 1000
tc class add dev eth0 parent 2:1 classid 2:5 htb rate 8000bps prio 4
tc qdisc add dev eth0 handle 6:0 parent 2:5 pfifo limit 5
tc class add dev eth0 parent 2:1 classid 2:6 htb rate 4000bps prio 5
tc qdisc add dev eth0 handle 7:0 parent 2:6 pfifo limit 3
tc filter add dev eth0 parent 2:0 protocol all prio 1 tcindex mask 0x7 shift 0
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 5 tcindex classid 2:4
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 4 tcindex classid 2:6
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 3 tcindex classid 2:5
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 2 tcindex classid 2:3
tc filter add dev eth0 parent 2:0 protocol all prio 1 handle 1 tcindex classid 2:2
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0xa010303 0x at 12 match u32 0xa010104 0x at 16 classid 1:2
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0xa010400 0x00 at 12 match u32 0xa010104 0x at 16 classid 1:2
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0xa010500 0x00 at 12 match u32 0xa010104 0x at 16 classid 1:4
tc filter add dev eth0 parent 1:0 protocol all prio 1 u32 match u32 0x0 0x0 at 0 classid 1:5
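Once rules of this kind are installed on the SmartProxy, the resulting qdisc/class/filter hierarchy and its live counters can be inspected with the standard `tc` show commands. This is a read-only configuration fragment (device name as above, root privileges assumed):

```shell
# Show the dsmark/htb qdisc tree installed on eth0
tc qdisc show dev eth0
# Show the HTB classes together with live byte/packet counters
tc -s class show dev eth0
# Show the tcindex and u32 filters attached to each parent
tc filter show dev eth0 parent 2:0
tc filter show dev eth0 parent 1:0
```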
A.3.3 tc.c (post DDoS)

Algorithm A.7 tc.c after the first attack

#include
#include
#include

$client =;
$ws =;
$ukbot =;
$kbot =;

dev eth0 {
    egress {
        // classification
        class ($top) ;
        class ($high) if (ip_src == $client && ip_dst == $ws) ;
        class ($medium) if (ip_src:24 == $ukbot && ip_dst == $ws) ;
        class ($low) if (ip_src:24 == $kbot && ip_dst == $ws) ;
        class ($fallback) if 1;
        htb () {
            class (rate 100000 kbps) {
                // legitimate - unknown src traffic
                $top = class ( prio 1, rate 20 Mbps ) { fifo (limit 1000p); };
                // ukbot for charts
                $high = class ( prio 2, rate 20 Mbps ) { fifo (limit 1000p); };
                // fallback - the unfiltered traffic
                $fallback = class ( prio 3, rate 10 Mbps ) { fifo (limit 1000p); };
                // ukbot after first attack
                $medium = class ( prio 4, rate 64 kbps ) { fifo (limit 1p); };
                // well-known attackers - kbot
                $low = class ( prio 5, rate 32 kbps ) { fifo (limit 1p); };
            }
        }
    }
}
Bibliography

[1] T. Peng, C. Leckie, and K. Ramamohanarao. Survey of network-based defense mechanisms countering the DoS and DDoS problems. ACM Computing Surveys, 39(1):3, 2007.
[2] Xin Liu, Xiaowei Yang, and Yanbin Lu. To Filter or to Authorize: Network-Layer DoS Defense Against Multimillion-node Botnets. In Proceedings of ACM SIGCOMM, August 2008.
[3] K. Argyraki and D. R. Cheriton. Scalable Network-layer Defense Against Internet Bandwidth-Flooding Attacks. To appear in ACM/IEEE ToN.
[4] R. Mahajan, S. M. Bellovin, S. Floyd, J. Ioannidis, V. Paxson, and S. Shenker. Controlling High Bandwidth Aggregates in the Network. SIGCOMM CCR, 32(3), 2002.
[5] A. Keromytis, V. Misra, and D. Rubenstein. SOS: An Architecture for Mitigating DDoS Attacks. IEEE JSAC, 22(1), 2004.
[6] B. Parno, D. Wendlandt, E. Shi, A. Perrig, B. Maggs, and Y.-C. Hu. Portcullis: Protecting Connection Setup from Denial-of-Capability Attacks. In ACM SIGCOMM, 2007.
[7] A. Yaar, A. Perrig, and D. Song. SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks. In IEEE Symposium on S&P, 2004.
[8] X. Yang, D. Wetherall, and T. Anderson. TVA: A DoS-limiting Network Architecture. IEEE/ACM Transactions on Networking (to appear), 2009.
[9] K. Argyraki and D. R. Cheriton. Network Capabilities: The Good, the Bad and the Ugly. In ACM HotNets-IV, 2005.
[10] M. Bailey, E. Cooke, F. Jahanian, Yunjing Xu, and M. Karir. A Survey of Botnet Technology and Defenses. In Conference For Homeland Security: Cybersecurity Applications & Technology (CATCH '09), DOI: 10.1109/CATCH.2009.40, 2009, pp. 299-304.
[11] R. Walsh, D. Lapsley, and W. T. Strayer. Effective Flow Filtering for Botnet Search Space Reduction. In Conference For Homeland Security: Cybersecurity Applications & Technology (CATCH '09), DOI: 10.1109/CATCH.2009.22, 2009, pp. 141-149.
[12] J. Mirkovic, J. Martin, and P. Reiher. A taxonomy of DDoS attacks and DDoS defense mechanisms. UCLA CSD Technical Report no. 020018.
[13] Tao Peng, Christopher Leckie, and Kotagiri Ramamohanarao. Survey of network-based defense mechanisms countering the DoS and DDoS problems. ACM Computing Surveys (CSUR), 39(1), p. 3-es, 2007.
[14] D. Moore, G. Voelker, and S. Savage. Inferring Internet Denial-of-Service Activity. In Proceedings of the 2001 USENIX Security Symposium, 2001.
[15] Noureldien A. Noureldien and Mashair O. Hussein (Department of Computer Science, University of Science and Technology, Omdurman, Sudan). Block Spoofed Packets at Source (BSPS): A Method for Detecting and Preventing All Types of Spoofed Source IP Packets and SYN Flooding Packets at Source: A Theoretical Framework. IEEE, 2009.
[16] Jelena Mirkovic, Alefiya Hussain, Brett Wilson, Sonia Fahmy, Peter L. Reiher, Roshan K. Thomas, Wei-Min Yao, and Stephen Schwab. Towards user-centric metrics for denial-of-service measurement. Experimental Computer Science 2007: 8.
[17] J. Mirkovic, S. Fahmy, P. Reiher, and R. Thomas. How to Test DDoS Defenses. In Proceedings of the Cybersecurity Applications & Technology Conference For Homeland Security (CATCH), 2009.
[18] G. Oikonomou, J. Mirkovic, P. Reiher, and M. Robinson. A Framework for Collaborative DDoS Defense.
Proceedings of the Annual Computer Security Applications Conference (ACSAC), 2006.
[19] K. Argyraki and D. R. Cheriton. Network capabilities: The good, the bad and the ugly. In Proc. ACM HotNets, 2005.
[20] X. Liu, A. Li, X. Yang, and D. Wetherall. Passport: Secure and adoptable source authentication. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation, 2008.
[21] M. Walsh, H. Balakrishnan, D. Karger, and S. Shenker. DDoS Defense by Offense. Presented at ACM SIGCOMM 2006, Pisa, Italy, September 2006.
[22] Supranamaya Ranjan, Ram Swaminathan, Mustafa Uysal, Antonio Nucci, and Edward Knightly. DDoS-shield: DDoS-resilient scheduling to counter application layer attacks. IEEE/ACM Trans. Netw., 17(1), 2009.
[23] Richard Atterer, Monika Wnuk, and Albrecht Schmidt. Knowing the user's every move: user activity tracking for website usability evaluation and implicit interaction. In WWW '06: Proceedings of the 15th International Conference on World Wide Web, ACM, 2006.
[24] Haining Wang, Cheng Jin, and Kang G. Shin. Defense against spoofed IP traffic using hop-count filtering. IEEE/ACM Trans. Netw., 2007.
[25] Christian Schridde, Matthew Smith, and Bernd Freisleben. TrueIP: prevention of IP spoofing attacks using identity-based cryptography. In SIN '09: Proceedings of the 2nd International Conference on Security of Information and Networks, 2009.
[26] M. Aron, D. Sanders, P. Druschel, and W. Zwaenepoel. Scalable content-aware request distribution in cluster-based network servers. Presented at the USENIX Annual Technical Conference, San Diego, CA, June 2000.
[27] C. Amza, A. Cox, and W. Zwaenepoel. Conflict-aware scheduling for dynamic content applications. Presented at the 4th USENIX Symposium on Internet Technologies and Systems (USITS), Seattle, WA, March 2003.
[28] I. Csiszar. The method of types. IEEE Trans. Inf. Theory, vol. 44, pp. 2505-2523, 1998.
[29] S. Katti and B. Krishnamurthy. Collaborating Against Common Enemies. In Proceedings of the ACM SIGCOMM/USENIX Internet Measurement Conference, October 2005.
[30] Mirco Marchetti, Michele Messori, and Michele Colajanni. Peer-to-Peer Architecture for Collaborative Intrusion and Malware Detection on a Large Scale. In ISC '09: Proceedings of the 12th International Conference on Information Security, 2009.
[31] Hamza Ghani, Abdelmajid Khelil, Neeraj Suri, György Csertán, László Gönczy, Gabor Urbanics, and James Clarke. Assessing the Security of Internet Connected Critical Infrastructures (The CoMiFin Project Approach). In Proc. SecIoT 2010 (to appear).
[32] A. Hussain, J. Heidemann, and C. Papadopoulos. Identification of repeated denial of service attacks. In Proceedings of IEEE INFOCOM 2006, Barcelona, Spain, 2006.
[33] H. Chen and Y. Chen. 2008. A Novel Embedded Accelerator for Online Detection of Shrew DDoS Attacks. In Proceedings of the 2008 International Conference on Networking, Architecture, and Storage (NAS, June 12-14, 2008), IEEE Computer Society, Washington, DC, pp. 365-372.
[34] G. Maciá-Fernández, J. E. Díaz-Verdejo, and P. García-Teodoro. 2009. Mathematical model for low-rate DoS attacks against application servers. Trans. Info. For. Sec., 4(3) (Sep. 2009), pp. 519-529.
[35] G. Maciá-Fernández, J. E. Díaz-Verdejo, and P. Garcia-Teodoro. LoRDAS: A low-rate DoS attack against application servers. In Proc. CRITIS '07, LNCS vol. 5141, 2008, pp. 197-209.
[36] S. Wei and J. Mirkovic. 2007. Building Reputations for Internet Clients. Electron. Notes Theor. Comput. Sci., 179 (Jul. 2007), pp. 17-30.
[37] M. Dacier, V. Pham, and O. Thonnard. 2009. The WOMBAT Attack Attribution Method: Some Results. In Proceedings of the 5th International Conference on Information Systems Security (Kolkata, India, December 14-18, 2009), A. Prakash and I. Sen Gupta, Eds., Lecture Notes in Computer Science, vol. 5905, Springer-Verlag, Berlin, Heidelberg, pp. 19-37.
[38] Marc Dacier, Corrado Leita, Olivier Thonnard, Van-Hau Pham, and Engin Kirda. Assessing cybercrime through the eyes of the WOMBAT.
Chapter 3 of Cyber Situational Awareness: Issues and Research, Springer International Series on Advances in Information Security, 2009.
[39] Corrado Leita, Ulrich Bayer, and Engin Kirda. Exploiting diverse observation perspectives to get insights on the malware landscape. International
Conference on Dependable Systems and Networks (DSN 2010), Chicago, June 2010.
[40] Jian Zhang, Phillip Porras, and Johannes Ullrich. Highly predictive blacklisting. In Proceedings of the 17th USENIX Security Symposium, pp. 107-122, July 28-August 1, 2008, San Jose, CA.
[41] Brett Stone-Gross, Christopher Kruegel, Kevin Almeroth, Andreas Moser, and Engin Kirda. FIRE: FInding Rogue nEtworks. In Proceedings of the 2009 Annual Computer Security Applications Conference, pp. 231-240, December 7-11, 2009.
[42] R. Baldoni, G. Csertain, H. Elshaafi, L. Gonczy, G. Lodi, and B. Mulcahy. Trust Management in Monitoring Financial Critical Information Infrastructures. In The 2nd International Conference on Mobile Lightweight Wireless Systems - Critical Information Infrastructure Protection Track, 2010.
[43] Peter Eckersley, Electronic Frontier Foundation. How Unique Is Your Web Browser?, 2010.
[44] S. Khattab, C. Sangpachatanaruk, D. Mossé, R. Melhem, and T. Znati (2004). Roaming Honeypots for Mitigating Service-Level Denial-of-Service Attacks. In Proc. Int. Conf. on Distributed Computing Systems (ICDCS '04), Washington, DC, USA, pp. 328-337. IEEE Computer Society, New York, NY, USA.
[45] Y. Chen and K. Hwang (2006). Collaborative detection and filtering of shrew DDoS attacks using spectral analysis. J. Parallel Distrib. Comput., 66, pp. 1137-1151.
[46] V. Siris and F. Papagalou (2004). Application of Anomaly Detection Algorithms for Detecting SYN Flooding Attacks. In Proc. GLOBECOM '04, Dallas, TX, USA, November 29-December 3, pp. 2050-2054.
[47] F. Leu and W. Yang (2005). Intrusion detection with CUSUM for TCP-based DDoS. Lect. Notes Comput. Sci., 3823, pp. 1255-1264.
[48] Manuel Egele, Leyla Bilge, Engin Kirda, and Christopher Kruegel (2010). CAPTCHA smuggling: hijacking web browsing sessions to create CAPTCHA farms. In Proceedings of the 2010 ACM Symposium on Applied Computing (SAC '10).
[49]
[50] 4chan launches DDoS against entertainment industry -
[51]
[52] Morris Worm:
[53] DDoS mitigation expert predicts more serious application-layer attacks. article/0,289142,sid14_gci1525260,00.html
[54] pirate-parties-call-for-operation-payback-ceasefire - Operation Payback. -
[55] 'Tis the Season of DDoS - WikiLeaks Edition - editio/
[56] K. Wooding. Magnification Attacks - Smurf, Fraggle, and Others.
[57] Web Services Description Language (WSDL),
[58] Universal Description Discovery and Integration, Universal_Description_Discovery_and_Integration.
[59] Robots exclusion standard,
[60] IETF RFC 4732: Internet Denial-of-Service Considerations. E. Rescorla, Ed., 2006.
[61] CERT Advisory CA-1997-28, IP Denial-of-Service Attacks, 2010.
[62] Linux webserver botnet pushes malware:
[63]
[64] DDoS_Attacks_Are_Back_and_Bigger_Than_Before
[65] JPgraph:
[66] PHP-Layers Menu:
[67] SMT2:
[68] MySQL:
[69]
[70] Apache:
[71]
[72] JOOMLA:
[73] Open Web Analytics (OWA):
[74] MasterShaper:
[75] DETERlab:
[76] Xvfb:
[77] SeleniumHQ:
[78] Ubuntu:
[79] Wikipedia - Ubuntu:
[80] Firefox:
[81] TCPdump:
[82] J4AGE:
[83] Sysstat/SAR:
[84] TC:
[85] TCNG:
[86] TCNG manual:
[87] iproute2:
[88] Mod HTML: code/5435/details
[89] UsaPROXY: