Global Botnet Detector

607 views

Published on

In a world where most of the internet traffic is produced by bots, who will defend the innocent from the relentless onslaught of malicious botnet activity?

Every day, countless incidents of botnet activity occur across the web, wreaking havoc in the form of mass security breaches, data scraping, fraudulent activity and DDoS attacks. The first step in defending against botnets is knowing when suspicious activity is taking place.

This talk covers what a botnet is and how botnets work, then walks through a technique we are developing at Distil Networks to detect the presence of a botnet and list the responsible participants. The method uses correlations in traffic on a customer’s site, together with user fingerprinting, to first alert when a botnet is present and then identify the key players.

Published in: Data & Analytics

  1. GLOBAL BOTNET DETECTOR - BRENTON MALLEN
  2. ABOUT ME: HI THERE!
     Data Scientist at Distil Networks
     Email: brenton.mallen@distilnetworks.com
     Blog: carpefridiem.wordpress.com
     Twitter: @BrentonMallen
  3. MOTIVATION: WHAT’S MY MOTIVATION?
     ▸ Product of an investigation of a DDoS attack on a customer
     ▸ Wanted a means to be alerted
     ▸ Wanted a means to identify a group of users potentially responsible
  4. BOTS & BOTNETS: WHAT ARE THESE BOTS I KEEP HEARING ABOUT?
     ▸ Automated code that pretends to be human
     ▸ Used to traverse them internets
     ▸ Not all bots are bad
  5. BOTS & BOTNETS: IS A BOT REALLY ALL THAT DANGEROUS?
     ▸ Botnets can cause damage:
        ▸ DDoS
        ▸ Mass Security Breaches
        ▸ Mass Data Theft
  6. ANALOGY: HOW ABOUT AN ANALOGY? (slides 6-8) [Images labeled "Casual Traffic" and "Botnet Traffic"]
  9. BOTNET DETECTOR: WHAT ARE THE GOALS OF A BOTNET DETECTOR?
     ▸ Detect
        ▸ Presence of a Botnet
     ▸ Identify
        ▸ List of Suspects
  10. BOTNET DETECTOR: WHAT TOOLS DO WE USE?
      ▸ Python
      ▸ Boto
      ▸ Numpy
      ▸ AWS
      ▸ Hadoop
      ▸ Hive
      ▸ M-R Streaming
      1.25 Billion Logs = 600 GB of Data per Day
  11. BOTNET DETECTOR: HOW DO WE DETECT A BOTNET?
      ▸ Part 1: Detect. For a given site, for each time window:
         ▸ AGGREGATE COUNTRY TRAFFIC
         ▸ CHECK FOR COORDINATED TRAFFIC
         ▸ PRODUCE ALERT
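The aggregation step named on this slide can be sketched roughly as follows. This is a minimal illustration, not the implementation used at Distil Networks: the record layout (a Unix timestamp plus a country code), the 60-second window and the function name `aggregate_country_traffic` are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_country_traffic(logs, window_seconds=60):
    """Count requests per country in fixed-size time windows.

    `logs` is an iterable of (unix_timestamp, country_code) pairs; this
    record layout is an assumption made for the example.
    Returns {window_index: {country_code: request_count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, country in logs:
        window = int(timestamp // window_seconds)
        counts[window][country] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {w: dict(c) for w, c in counts.items()}


# Hypothetical log records: (seconds since start, country code).
logs = [(0, "AR"), (10, "AR"), (59, "ZA"), (60, "AR"), (125, "ID")]
traffic = aggregate_country_traffic(logs)
```

At the data volume quoted in the talk, the same counting would run as a distributed job rather than in a single process; the sketch only shows the shape of the result.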
  12. SITE TRAFFIC: WHAT DOES THE TRAFFIC LOOK LIKE? [Plot: request count vs. time]
  13. BOTNET DETECTION: CROSS-COUNTRY CORRELATION (slides 13-17) [Figure sequence: country-by-country correlation matrix for one time window per frame; countries A-Z on both axes, color scale from -1.0 to 1.0]
  18. BOTNET DETECTION: CROSS-COUNTRY CORRELATION [Figure: correlation matrices for casual traffic and suspicious traffic; countries A-Z on both axes, color scale from -1.0 to 1.0]
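A cross-country correlation matrix of the kind pictured on these slides could be computed along the following lines. This is a sketch under assumptions: the talk does not say which correlation measure is used, so Pearson correlation via `numpy.corrcoef` is assumed here, as is the array layout (one row of request counts per country) and the choice to treat a country with constant traffic as uncorrelated.

```python
import numpy as np


def country_correlation(counts):
    """Pearson correlation between every pair of country time series.

    `counts` has shape (n_countries, n_time_bins): one row of request
    counts per country within a single time window (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = np.corrcoef(counts)
    # A country with constant traffic has zero variance, so its
    # coefficients are undefined (NaN); treat them as "no correlation".
    corr = np.nan_to_num(corr, nan=0.0)
    np.fill_diagonal(corr, 1.0)
    return corr


# Hypothetical request counts for four countries over four time bins.
counts = [
    [1, 2, 3, 4],  # rises steadily
    [2, 4, 6, 8],  # rises in lockstep with the first row
    [4, 3, 2, 1],  # falls while the first row rises
    [5, 5, 5, 5],  # constant traffic
]
corr = country_correlation(counts)
```

Here the first two rows correlate at 1.0, the first and third at -1.0, and the constant row is reported as 0.0 against every other country.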
  19. BOTNET DETECTION: CORRELATION COEFFICIENT 2D-HISTOGRAM (slides 19-21) [Figure sequence: casual traffic on Site A and Site B; correlation coefficient from -1.0 to 1.0 vs. time, count scale starting at 0]
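One way to build a 2D histogram like the one on these slides is to histogram the pairwise coefficients of each time window and stack the results along the time axis. The sketch below assumes that construction; the bin count and the function names `coefficient_histogram` and `histogram_over_time` are hypothetical.

```python
import numpy as np


def coefficient_histogram(corr, bins=20):
    """Histogram of the unique off-diagonal coefficients of one window."""
    corr = np.asarray(corr, dtype=float)
    upper = corr[np.triu_indices_from(corr, k=1)]  # each country pair once
    hist, _ = np.histogram(upper, bins=bins, range=(-1.0, 1.0))
    return hist


def histogram_over_time(corr_per_window, bins=20):
    """Stack per-window histograms: rows are coefficient bins, columns are time."""
    return np.column_stack(
        [coefficient_histogram(c, bins) for c in corr_per_window]
    )


# Hypothetical correlation matrices for two consecutive time windows.
casual = [[1.0, 0.05, -0.05], [0.05, 1.0, 0.02], [-0.05, 0.02, 1.0]]
coordinated = [[1.0, 0.95, 0.95], [0.95, 1.0, 0.95], [0.95, 0.95, 1.0]]
hist2d = histogram_over_time([casual, coordinated], bins=4)
```

With four bins, the casual window puts its counts in the two bins around zero, while the coordinated window puts all three pairs in the top bin, which is the kind of shift the Site C slides appear to show.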
  22. BOTNET DETECTION (slides 22-27) [Figure sequence: correlation coefficient 2D-histogram for Site C; correlation coefficient from -1.0 to 1.0 vs. time, count scale starting at 0]
  28. BOTNET DETECTION: ALERT PARAMETER (slides 28-29) [Plot: energy vs. time, with the alert threshold marked]
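The slides plot an "energy" against time and mark an alert threshold, but the transcript does not define the energy. As a loudly labeled assumption, the sketch below takes it to be the mean squared off-diagonal correlation of a time window, and raises an alert for every window whose energy exceeds a hypothetical threshold.

```python
import numpy as np


def correlation_energy(corr):
    """Mean squared off-diagonal correlation.

    This definition of "energy" is an assumption; the talk does not
    state how the alert parameter is computed.
    """
    corr = np.asarray(corr, dtype=float)
    upper = corr[np.triu_indices_from(corr, k=1)]
    return float(np.mean(upper ** 2))


def alert_windows(energies, threshold):
    """Indices of the time windows whose energy exceeds the threshold."""
    return [i for i, e in enumerate(energies) if e > threshold]


# Hypothetical energies for four consecutive windows and a made-up threshold.
energies = [0.02, 0.03, 0.85, 0.04]
alerts = alert_windows(energies, threshold=0.5)
```

Squaring makes strong negative correlations count as much as strong positive ones; whether that is desirable depends on how the real alert parameter is defined.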
  30. IDENTIFY PARTICIPANTS: HOW DO WE FIND THOSE RESPONSIBLE?
      ▸ Part 2: Identify Participants
      ▸ From Detection Phase
         ▸ Times of Alerts
         ▸ Participating Countries
      ▸ Requires User Fingerprint
         ▸ ID Based on Various User Configuration Parameters
  31. IDENTIFY PARTICIPANTS: HOW DO WE FIND THOSE RESPONSIBLE?
      ▸ ISOLATE USERS IN COUNTRIES
      ▸ CHECK FOR MULTI-COUNTRY PRESENCE
      ▸ FIND COORDINATED USERS
  32. IDENTIFY PARTICIPANTS [Figure: request counts vs. time for users A1, A2, B1 and B2; panels labeled "Argentina - South Africa" and "Indonesia - Russian Federation"; threat scores 0.77 and 0.94]
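The identification steps on the previous slides (multi-country presence, then coordinated users) might look something like the sketch below. The transcript does not say how the threat score is calculated, so the score here, a user's highest Pearson correlation with any other user, is an assumption, as are the function names and the input layouts.

```python
from collections import defaultdict

import numpy as np


def multi_country_users(requests):
    """Fingerprint IDs seen in more than one country.

    `requests` is an iterable of (fingerprint_id, country) pairs
    (assumed layout).
    """
    countries = defaultdict(set)
    for fingerprint_id, country in requests:
        countries[fingerprint_id].add(country)
    return {fid for fid, seen in countries.items() if len(seen) > 1}


def threat_scores(series_by_user):
    """Score each user by its highest correlation with any other user.

    `series_by_user` maps fingerprint ID -> request counts per time bin.
    Assumes at least two users and no constant series. The scoring rule
    itself is an assumption, not the method described in the talk.
    """
    ids = sorted(series_by_user)
    corr = np.corrcoef([series_by_user[i] for i in ids])
    np.fill_diagonal(corr, -1.0)  # ignore each user's self-correlation
    return {fid: float(corr[k].max()) for k, fid in enumerate(ids)}


# Hypothetical users: A1 and A2 move together, B1 does not.
scores = threat_scores({
    "A1": [1, 5, 1, 5],
    "A2": [2, 10, 2, 10],
    "B1": [3, 3, 4, 1],
})
```

In this toy data A1 and A2 each score close to 1.0 because their request counts rise and fall together, while B1 scores below zero.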
  33. IDENTIFY PARTICIPANTS: WHAT DOES THE FINAL OUTPUT LOOK LIKE?
      ID                                     Threat Score
      007E6ABE-A48C-3DE5-81E0-CBECBC2C96AB   0.82
      07EF4DBE-EC0D-3BCE-A5BA-5910FF2457F5   0.97
      0CCA9DA5-D63D-34E9-85A1-55154E5480E2   0.96
      17C00FD8-E931-3789-AAC4-ED004C9143DB   0.90
      22533F87-4B97-356A-95A4-84D5A8841F63   0.78
      2E1C87C1-90BF-37BB-9A33-C482038AEE57   0.92
      2F91B34E-AB15-389B-BCB6-8D913135D      0.95
      3F6B5DF3-607E-3F1F-8050-2932B11D9E8A   0.94
      46069A1E-F077-3F78-870A-C9BD7A0E1740   0.81
      58A8DB25-2B99-3D2F-BA6D-50D1A8CFF3E9   0.77
      58CBD814-CAC1-3644-8AB9-99A3C07A8E8F   0.70
      6336DAC7-6508-3E79-9D99-37034A7C2E3F   0.83
      655A6266-D316-360C-BAC1-76F26F3C0643   0.72
      66C3A2B1-2953-3848-882C-591224C77E33   0.91
  34. RECAP: WHAT DID WE DO?
      ▸ DETECTED THE PRESENCE OF A BOTNET
      ▸ SCRUTINIZED USERS FROM PARTICIPATING COUNTRIES
      ▸ PRODUCED A LIST OF SUSPECT USERS
  35. PERFORMANCE: HOW DOES IT PERFORM?
      ▸ Prototype - Looks at Past Data
      ▸ Applied to an attack investigation
      ▸ 10 alerts over the month in question
      ▸ 100% of responsible users*
      ▸ Botnet Limited to Cross-Country
      ▸ Lacks Sub-Country insight
      * Deemed responsible by the customer
  36. FUTURE WORK: WHERE DO WE GO FROM HERE?
      ▸ Integrate into ML product
      ▸ Extract Features from Suspects
      ▸ Address Pitfalls
      ▸ Inefficiencies Due to Sparsity, Intra-country Activity
      ▸ 24/7 Streaming Process Across all customer sites
      ▸ Utilize New Tools
      ▸ Spark, Storm, etc.
      ▸ Internal Platform
  37. QUESTIONS? THANK YOU