Presentation Brucon - Anubisnetworks and PTCoresec


  1. 1. Real time analysis and visualization ANUBISNETWORKS LABS PTCORESEC 1
  2. 2. Agenda  Who are we?  AnubisNetworks Stream  Stream Information Processing  Adding Valuable Information to Stream Events 2
  3. 3. Who are we?  Tiago Martins  AnubisNetworks  @Gank_101 3  João Gouveia  AnubisNetworks  @jgouv  Tiago Henriques  Centralway  @Balgan
  4. 4. Anubis StreamForce  Events (lots and lots of events)  Events are “volatile” by nature  They exist only if someone is listening  Remember?: “If a tree falls in a forest and no one is around to hear it, does it make a sound?” 4
  5. 5. Anubis StreamForce  Enter security Big Data “a brave new world” 5 Volume Variety Velocity We are here
  6. 6. Anubis StreamForce  Problems (and ambitions) to tackle  The huge amount and variety of data to process  Mechanisms to share data across multiple systems, organizations, teams, companies..  Common API for dealing with all this (both from a producer and a consumer perspective) 6
  7. 7. Anubis StreamForce  Enter the security events CEP - StreamForce High performance, scalable, Complex Event Processor (CEP) – 1 node (commodity hw) = 50k evt/second Uses streaming technology Follows a publish/subscribe model 7
  8. 8. Anubis StreamForce  Data format Events are published in JSON format Events are consumed in JSON format 8
  9. 9. Anubis StreamForce  Yes, we love JSON 9
  10. 10. Anubis StreamForce 10 Sharing Models
  11. 11. MFE OpenSource / MailSpike community Dashboard Dashboard Complex Event Processing Sinkholes Data-theft Trojans Real Time Feeds Real Time Feeds IP Reputation Passive DNS Traps / Honeypots Twitter
  13. 13. Anubis CyberFeed 13  Feed galore! Sinkhole data, traps, IP reputation, etc.  Bespoke feeds (create your own view)  Measure, group, correlate, de-duplicate ..  High volume (usually ~6,000 events per second), more data being added frequently
  14. 14. MFE OpenSource / MailSpike community Dashboard Event navigation Complex Event Processing Sinkholes Data-theft Trojans Real Time Feeds Real Time Feeds IP Reputation Passive DNS Traps / Honeypots Twitter
  15. 15. Anubis CyberFeed 15  Apps (demo time)
  16. 16. Stream Information Processing  Collecting events from the Stream.  Generating reports.  Real time visualization. 16
  17. 17. Challenge  ~6k events/s and at peak over 10k events/s.  Let's focus on the trojans feed (banktrojan).  Peaks @ ~4k events/s {"_origin":"banktrojan","env":{"server_name":"anam0rph.su","remote_addr":"46.247.141.66","path_info":"/in.php","request_method":"POST","http_user_agent":"Mozilla/4.0"},"data":"upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0","seen":1379956636,"trojanfamily":"Zeus","_provider":"lab","hostn":"lab14","_ts":1379956641} 17
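An event like the one above can be consumed with a few lines of NodeJS. This is a sketch: the HTTP streaming transport is omitted, only fields shown on the slide are used, and the machine+trojan key layout is an assumption based on the grouping described later.

```javascript
// One line of the feed is one JSON document; reconstruct it here as a string
// to stand in for a line read from the stream.
const line = JSON.stringify({
  _origin: "banktrojan",
  env: { server_name: "anam0rph.su", remote_addr: "46.247.141.66",
         path_info: "/in.php", request_method: "POST",
         http_user_agent: "Mozilla/4.0" },
  seen: 1379956636,
  trojanfamily: "Zeus",
  _provider: "lab"
});

const evt = JSON.parse(line);                              // events arrive as JSON text
const key = evt.env.remote_addr + ":" + evt.trojanfamily;  // machine+trojan identity
console.log(key); // "46.247.141.66:Zeus"
```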
  18. 18. Challenge 18
  19. 19. Challenge 19
  20. 20. Challenge  Let's use the Stream to help  Group by machine and trojan  From peak ~4k/s to peak ~1k/s  Filter fields.  Geo location  We end up with {"env":{"remote_addr":"207.215.48.83"},"trojanfamily":"W32Expiro","_geo_env_remote_addr":{"country_code":"US","country_name":"United States","city":"Los Angeles","latitude":34.0067,"longitude":-118.3455,"asn":7132,"asn_name":"AS for SBIS-AS"}} 20
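The grouping step above can be sketched client-side as a time-windowed de-duplicator. A minimal sketch, assuming a 5 s window as on the Globe slide later; names and the exact drop policy are illustrative, not the Stream's actual implementation.

```javascript
// Pass through the first (machine, trojan) sighting per window, drop repeats.
const WINDOW_MS = 5000;
const seen = new Map();          // key -> timestamp of last emitted event

function dedupe(evt, now) {
  const key = evt.env.remote_addr + "|" + evt.trojanfamily;
  const last = seen.get(key);
  if (last !== undefined && now - last < WINDOW_MS) return null; // duplicate
  seen.set(key, now);
  return evt;                    // first sighting in this window passes
}

// Same machine + trojan twice within 5 s -> only the first passes.
const e = { env: { remote_addr: "207.215.48.83" }, trojanfamily: "W32Expiro" };
console.log(dedupe(e, 0) !== null);    // true
console.log(dedupe(e, 1000) === null); // true
```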
  21. 21. Challenge  How to process and store these events? 21
  22. 22. Technologies 22  Applications  NodeJS  Server-side Javascript Platform.  V8 Javascript Engine.  http://nodejs.org/ Why?  Great for prototyping.  Fast and scalable.  Modules for (almost) everything.
  23. 23. Technologies 23  Databases  MongoDB  NoSQL Database.  Stores JSON-style documents.  GridFS  http://www.mongodb.org/ Why?  JSON from the Stream, JSON in the database.  Fast and scalable.  Redis  Key-value storage.  In-memory dataset.  http://redis.io/ Why?  Faster than MongoDB for certain operations, like keeping track of number of infected machines.  Very fast and scalable.
  24. 24. Data Collection 24 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  Applications  Collector  Worker  Processor  Databases  MongoDB  Redis Collector Stream
  25. 25. Data Collection 25 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  Events comes from the Stream.  Collector distributes events to Workers.  Workers persist event information.  Processor aggregates information and stores it for statistical and historical analysis. Collector Stream
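The Collector-to-Workers fan-out described above can be sketched like this. The slides only say the Collector "distributes" events; round-robin and the in-process worker functions are assumptions standing in for whatever transport (sockets, queues) the real system uses.

```javascript
// A collector that hands each incoming event to the next worker in turn.
function makeCollector(workers) {
  let next = 0;
  return function dispatch(evt) {
    const w = workers[next];
    next = (next + 1) % workers.length;  // round-robin over the worker pool
    w(evt);
  };
}

const handled = [0, 0, 0];
const workers = handled.map((_, i) => () => { handled[i]++; });
const dispatch = makeCollector(workers);
for (let i = 0; i < 6; i++) dispatch({ seq: i });
console.log(handled); // [ 2, 2, 2 ]
```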
  26. 26. Data Collection 26 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  MongoDB  Real time information of infected machines.  Historical aggregated information.  Redis  Real time counters of infected machines. Collector Stream
  27. 27. Data Collection - Collector 27 Collector  Old data is periodically removed, i.e. machines that don't produce events for more than 24 hours.  Sends events to Workers. Workers  Decrement counters of removed information.  Send warnings  Country / ASN is no longer infected.  Botnet X decreased Y % of its size.
  28. 28. Data Collection - Worker 28 Worker  Creates new entries for unseen machines.  Adds information about new trojans / domains.  Updates the last time the machine was seen.  Processes events and updates the Redis counters accordingly.  Needs to check MongoDB to determine if:  New entry – all counters incremented  Existing entry – increment only the counters related to that trojan  Sends warnings  Botnet X increased its size by Y %.  New infections seen in Country / ASN.
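The Worker's new-entry vs. existing-entry decision can be sketched as follows. A plain `Map` stands in for Redis and `known` stands in for the MongoDB machine collection; the `trojan:country` key layout follows the Redis slide further on. This is an illustration of the counting rule, not the production code.

```javascript
const counters = new Map();   // stand-in for Redis counters
const known = new Map();      // ip -> Set of trojans already recorded (stand-in for MongoDB)

function incr(key) { counters.set(key, (counters.get(key) || 0) + 1); }

function onEvent(ip, trojan, country) {
  let trojans = known.get(ip);
  if (!trojans) {                      // new machine: bump every counter
    known.set(ip, new Set([trojan]));
    incr(trojan); incr(country); incr(trojan + ":" + country);
  } else if (!trojans.has(trojan)) {   // known machine, new trojan: trojan counters only
    trojans.add(trojan);
    incr(trojan); incr(trojan + ":" + country);
  }                                    // known machine + trojan: nothing to count
}

onEvent("1.2.3.4", "zeus", "PT");
onEvent("1.2.3.4", "zeus", "PT");   // duplicate sighting: ignored
onEvent("1.2.3.4", "tdss", "PT");   // same machine, new trojan
console.log(counters.get("zeus"), counters.get("PT"), counters.get("tdss:PT")); // 1 1 1
```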
  29. 29. Data Collection - Processor Processor 29  Processor retrieves real time counters from Redis.  Information is processed by:  Botnet;  ASN;  Country;  Botnet/Country;  Botnet/ASN/Country;  Total.  Persisting information to MongoDB creates a historic database of counters that can be queried and analyzed.
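The roll-up the Processor performs can be sketched by aggregating `trojan:country` counters into the per-botnet, per-country, and total views listed above. Input shape is an assumption (a flat object of counter keys, as in the Redis slide).

```javascript
// Roll "trojan:country" counters up into per-dimension totals.
function aggregate(counters) {
  const byBotnet = {}, byCountry = {};
  let total = 0;
  for (const [key, n] of Object.entries(counters)) {
    const [trojan, country] = key.split(":");
    byBotnet[trojan] = (byBotnet[trojan] || 0) + n;
    byCountry[country] = (byCountry[country] || 0) + n;
    total += n;
  }
  return { byBotnet, byCountry, total };
}

const out = aggregate({ "zeus:PT": 10, "zeus:US": 5, "tdss:PT": 2 });
console.log(out.byBotnet.zeus, out.byCountry.PT, out.total); // 15 12 17
```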
  30. 30. Data Collection - MongoDB  Collection for active machines in the last 24h { "city" : "Philippine", "country" : "PH", "region" : "N/A", "geo" : { "lat" : 16.4499, "lng" : 120.5499 }, "created" : ISODate("2013-09-21T00:19:12.227Z "), "domains" : [ { "domain" : "hzmksreiuojy.nl", "trojan" : "zeus", "last" : ISODate("2013-09-21T09:42:56.799Z"), "created" : ISODate("2013-09-21T00:19:12.227Z") } ], "host" : "112.202.37.72.pldt.net", "ip" : "112.202.37.72", "ip_numeric" : 1892296008, "asn" : "Philippine Long Distance Telephone Company", "asn_code" : 9299, "last" : ISODate("2013-09-21T09:42:56.799Z"), "trojan" : [ "zeus” ] } 30
  31. 31. Data Collection - MongoDB  Collection for aggregated information (the historic counters database) { "_id" : ObjectId("519c0abac1172e813c004ac3"), "0" : 744, "1" : 745, "3" : 748, "4" : 748, "5" : 746, "6" : 745, ... "10" : 745, "11" : 742, "12" : 746, "13" : 750, "14" : 753, ... "metadata" : { "country" : "CH", "date" : "2013-05-22T00:00:00+0000", "trojan" : "conficker_b", "type" : "daily" } } 31 Preallocated entries for each hour when the document is created. If we don’t, MongoDB will keep extending the documents by adding thousands of entries every hour and it becomes very slow.
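The preallocation trick in the note above looks roughly like this: build the document with all 24 hourly fields ("0".."23") set to zero at creation time, so later in-place increments never grow the document and force MongoDB to move it. A sketch; field names follow the document shown on the slide.

```javascript
// Preallocate hourly counters when the daily document is created.
function newDailyDoc(country, trojan, date) {
  const doc = { metadata: { country, trojan, date, type: "daily" } };
  for (let h = 0; h < 24; h++) doc[String(h)] = 0;  // "0".."23" all present up front
  return doc;
}

const doc = newDailyDoc("CH", "conficker_b", "2013-05-22T00:00:00+0000");
console.log(doc["0"], doc["23"], doc.metadata.type); // 0 0 daily
```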
  32. 32. Data Collection - MongoDB  Collection for 24 hours  4 MongoDB Shard instances  >3 Million infected machines  ~2 Gb of data  ~558 bytes per document.  Indexes by  ip – helps inserts and updates.  ip_numeric – enables queries by CIDRs.  last – faster removes for expired machines.  host – Hmm, is there any .gov?   country, family, asn – speeds up MongoDB queries and also allows faster custom queries.  Collection for aggregated information  Data for 119 days (25 May to 11 July)  > 18 Million entries  ~6.5 Gb of data  ~366 bytes per object  ~56 Mb per day  Indexes by  metadata.country  metadata.trojan  metadata.date  metadata.asn  metadata.type, metadata.country, metadata.date, met....... (all) 32
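Why `ip_numeric` enables CIDR queries: an IPv4 address maps to a 32-bit integer, and a CIDR block becomes one contiguous `[lo, hi]` range that an index can scan. A sketch of the conversion; the `ip_numeric` value matches the document shown earlier (112.202.37.72 → 1892296008).

```javascript
// IPv4 dotted quad -> 32-bit integer.
function ipToNumeric(ip) {
  return ip.split(".").reduce((n, o) => n * 256 + Number(o), 0);
}

// CIDR block -> inclusive numeric range [lo, hi].
function cidrRange(cidr) {
  const [base, bitsStr] = cidr.split("/");
  const size = Math.pow(2, 32 - Number(bitsStr));
  const lo = Math.floor(ipToNumeric(base) / size) * size;  // align to block start
  return { lo, hi: lo + size - 1 };
}

const { lo, hi } = cidrRange("95.68.149.0/22");   // range from the API slide
const n = ipToNumeric("95.68.150.1");
console.log(lo <= n && n <= hi); // true: the address falls inside the /22
```

A range query like `{ ip_numeric: { $gte: lo, $lte: hi } }` then resolves a CIDR lookup with a single index scan.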
  33. 33. Data Collection - Redis  Counters by Trojan / Country "cutwailbt:RO": "1256", "rbot:LA": "3", "tdss:NP": "114", "unknown4adapt:IR": "100", "unknownaff:EE": "0", "cutwail:CM": "20", "unknownhrat3:NZ": "56", "cutwailbt:PR": "191", "shylock:NO": "1", "unknownpws:BO": "3", "unknowndgaxx:CY": "77", "fbhijack:GH": "22", "pushbot:IE": "2", "carufax:US": "424"  Counters by Trojan "unknownwindcrat": "18", "tdss": "79530", "unknownsu2": "2735", "unknowndga9": "15", "unknowndga3": "17", "ircbot": "19874", "jshijack": "35570", "adware": "294341", "zeus": "1032890", "jadtre": "40557", "w32almanahe": "13435", "festi": "1412", "qakbot": "19907", "cutwailbt": "38308"  Counters by Country "BY": "11158", "NA": "314", "BW": "326", "AS": "35", "AG": "94", "GG": "43", "ID": "142648", "MQ": "194", "IQ": "16142", "TH": "105429", "MY": "35410", "MA": "15278", "BG": "15086", "PL": "27384" 33
  34. 34. Data Collection - Redis  Redis performance in our machine  SET: 473036.88 requests per second  GET: 456412.59 requests per second  INCR: 461787.12 requests per second  Time to get real time data  Getting all the data from Families/ASN/Counters to the NodeJS application and ready to be processed in around half a second  > 120 000 entries in… (very fast..)  Our current usage is  ~ 3% CPU (of a 2.0 Ghz core)  ~ 480 Mb of RAM 34
  35. 35. Data Collection - API  But! There is one more application..  How to easily retrieve stored data  MongoDB Rest API is a bit limited.  NodeJS HTTP + MongoDB + Redis  Redis  http://<host>/counters_countries  ...  MongoDB  http://<host>/family_country  ...  Custom MongoDB Queries  http://<host>/ips?f.ip_numeric=95.68.149.0/22  http://<host>/ips?f.country=PT  http://<host>/ips?f.host=bgovb 35
  36. 36. Data Collection - Limitations  Grouping information by machine and trojan doesn't allow us to study the real number of events per machine.  Can be useful to get an idea of the botnet operations or how many machines are behind a single IP (everyone is behind a router).  Slow MongoDB impacts everything  Worker application needs to tolerate a slow MongoDB and discard some information as a last resort.  Beware of slow disks! Data persistence occurs every 60 seconds (default) and can take too much time, having a real impact on performance.  >10s to persist is usually very bad, something is wrong with the hard drives.. 36
  37. 37. Data Collection - Evolution  Warnings  Which warnings to send? When? Thresholds?  Aggregate data by week, month, year.  Aggregate information in shorter intervals.  Data Mining algorithms applied to all the collected information.  Apply same principles to other feeds of the Stream.  Spam  Twitter  Etc.. 37
  38. 38. Reports  What's happening in country X?  What about network 192.168.0.1/24?  Can you send me the report of Y every day at 7 am?  Ohh!! Remember the report I asked for last week?  Can I get a report for ASN AnubisNetwork? 38
  39. 39. Reports 39  HTTP API  Schedule  Get  Edit  Delete  List schedules  List reports  Check MongoDB for work.  Generate CSV report or store the JSON Document for later querying.  Send email with link to files when report is ready. Server Generator
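The Generator's CSV step can be sketched as plain JSON-to-CSV flattening. Columns and the array-joining rule are assumptions based on the machine documents shown on the next slides; the real reports may include more fields.

```javascript
// Flatten report documents into CSV; array fields (e.g. trojan) join with ";".
function toCsv(rows) {
  const cols = ["ip", "host", "country", "trojan"];
  const esc = v => '"' + String(v).replace(/"/g, '""') + '"';   // CSV quoting
  const lines = [cols.join(",")];
  for (const r of rows)
    lines.push(cols.map(c => esc(Array.isArray(r[c]) ? r[c].join(";") : r[c])).join(","));
  return lines.join("\n");
}

const csv = toCsv([{ ip: "2.80.2.53", host: "bl19-1-13.dsl.telepac.pt",
                     country: "PT", trojan: ["conficker_b"] }]);
console.log(csv.split("\n")[1]);
```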
  40. 40. Reports – MongoDB CSVs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, "reports" : [ ObjectId("51d64e7037571bd24500000d"), ObjectId("51d741e8bcb161366600000c"), ObjectId("51d89367bcb161366600005f"), ObjectId("51d9e4f9bcb16136660000ca"), ObjectId("51db3678c3a15fc577000038"), ObjectId("51dc87e216eea97c20000007"), ObjectId("51ddd964a89164643b000001") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Report { "__v" : 0, "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "files" : [ ObjectId("51d89368bcb1613666000060") ], "work" : ObjectId("51d64e6d5e8fd0d145000008") }  Files  Each report has an array of files that represents the report.  Each file is stored in GridFS. 40
  41. 41. Reports – MongoDB JSONs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, "snapshots" : [ ObjectId("521f761c0a45c3b00b000001"), ObjectId("521fb0848275044d420d392f"), ObjectId("52207c2f7c53a8494f010afa"), ObjectId("5221c9df4910ba3874000001"), ObjectId("522275724910ba3874001f66"), ObjectId("5223c6f24910ba3874003b7a"), ObjectId("522518734910ba3874005763") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Snapshot { "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "work" : ObjectId("521f761c0a45c3b00b000001"), count: 123 }  Results { "machine" : { "trojan" : [ "conficker_b" ], "ip" : "2.80.2.53", "host" : "Bl19-1-13.dsl.telepac.pt", }, … , "metadata" : { "work" : ObjectId("521f837647b8d3ba7d000001"), "snaptshot" : ObjectId("521f837aa669d0b87d000001"), "date" : ISODate("2013-08-29T00:00:00Z") }, } 41
  42. 42. Reports – Evolution  Other reports formats.  Charts?  Other type of reports. (Not only botnets).  Need to evolve Collector first. 42
  43. 43. Globe  How to visualize real time events from the stream?  Where are the botnets located?  Who's the most infected?  How many infections? 43
  44. 44. Globe – Stream  origin = banktrojan  Modules  Group  trojanfamily  _geo_env_remote_addr.country_name  grouptime=5000  Geo  Filter fields  trojanfamily  Geolocation  _geo_env_remote_addr.l*  KPI  trojanfamily  _geo_env_remote_addr.country_name  kpilimit = 10 44 Stream NodeJS Browser  Request botnets from stream
  45. 45. Globe – NodeJS 45 Stream NodeJS Browser  NodeJS  HTTP  Get JSON from Stream.  Socket.IO  Multiple protocol support (to bypass some proxies and handle old browsers).  Redis  Get real time number of infected machines.
  46. 46. Globe – Browser 46 Stream NodeJS Browser  Browser  Socket.IO Client  Real time apps.  Websockets and other types of transport.  WebGL  ThreeJS  Tween  jQuery  WebWorkers  Runs in the background.  Where to place the red dots?  Calculations from geolocation to 3D point goes here.
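The WebWorker's "where to place the red dots" computation can be sketched as a geolocation-to-sphere projection. The radius and axis conventions here (Y-up, ThreeJS-style) are assumptions; the actual globe code may orient differently.

```javascript
// Map latitude/longitude onto a sphere of the given radius.
function latLngToVec3(lat, lng, radius) {
  const phi = (90 - lat) * Math.PI / 180;    // polar angle measured from the north pole
  const theta = (lng + 180) * Math.PI / 180; // azimuth around the Y axis
  return {
    x: -radius * Math.sin(phi) * Math.cos(theta),
    y:  radius * Math.cos(phi),
    z:  radius * Math.sin(phi) * Math.sin(theta),
  };
}

const p = latLngToVec3(34.0067, -118.3455, 1); // Los Angeles, from the event shown earlier
console.log(Math.hypot(p.x, p.y, p.z).toFixed(3)); // "1.000": the dot sits on the unit sphere
```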
  47. 47. Globe – Evolution  Some kind of HUD to get better interaction and notifications.  Request actions by clicking on the globe.  Generate report of infected in that area.  Request operations in a specific area.  Real time warnings  New Infections  Other types of warnings... 47
  48. 48. Adding Valuable Information to Stream Events  How to distribute workload to other machines?  Adding value to the information we already have. 48
  49. 49. Minions  Typically the operations that would add value are expensive in terms of resources  CPU  Bandwidth  Master-slave approach that distributes work among distributed slaves we call Minions. 49 Master Minion Minion Minion Minion
  50. 50. Minions 50  Master receives work from Requesters and stores the work in MongoDB.  Minions request work.  Requesters receive real time information on the work from the Master or they can ask for work information at a later time. Process / Storage Minions Master MongoDB DNS Scan Minion Minion Requesters Minion
  51. 51. Minions  Master has an API that allows custom Requesters to ask for work and monitor the work.  Minions have a modular architecture  Easily create a custom module.  Information received from the Minions can then be processed by the Requesters and  Sent to the Stream  Saved to the database  Used to update the existing database 51 Minion DNS Scanning Data Mining
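The Master's pull-based work queue can be sketched as follows. An in-memory array stands in for MongoDB, and the function names (`submitWork`, `requestWork`, `completeWork`) are illustrative, not the real API; the key point shown is that Minions pull work rather than the Master pushing it.

```javascript
const queue = [];    // stand-in for the MongoDB work collection
const results = [];  // what Requesters read back later

function submitWork(type, payload) {          // called by a Requester
  queue.push({ type, payload, status: "pending" });
}
function requestWork() {                      // called by a Minion when idle
  const job = queue.find(j => j.status === "pending");
  if (job) job.status = "running";            // claim it so no other minion takes it
  return job || null;
}
function completeWork(job, result) {          // Minion reports back to the Master
  job.status = "done";
  results.push(result);
}

submitWork("dns", { name: "anam0rph.su" });   // domain from the Zeus event earlier
const job = requestWork();
completeWork(job, { name: "anam0rph.su", a: "46.247.141.66" });
console.log(results.length, requestWork()); // 1 null
```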
  52. 52. Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 52
  53. 53. Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 53 FOCUS FOCUS
  54. 54. Portscanning  Portscanning done right…  It's not only about your portscanner being able to throw 1 billion packets per second.  Location = reliability of scans.  A distributed system for portscanning is much better. But it's not just about having it distributed. It's about optimizing what it scans. 54
  55. 55. Portscanning 55
  56. 56. Portscanning 56
  57. 57. Portscanning 57
  58. 58. Portscanning IP Australia (intervolve) China (ChinaVPShosting) Russia (NQHost) USA (Ramnode) Portugal (Zon PT) 41.63.160.0/19 (Angola) 0 hosts up 0 hosts up 0 hosts up 0 hosts up 3 hosts up (sometimes) 5.1.96.0/21 (China) 10 hosts up 70 hosts up 40 hosts up 10 hosts up 40 hosts up 41.78.72.0/22 (Somalia) 0 hosts up 0 hosts up 0 hosts up 0 hosts up 33 hosts up 92.102.229.0/24 (Russia) 20 hosts up 100 hosts up 2 hosts up 2 hosts up 150 hosts up 58
  59. 59. Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource wise you are gonna have a bad time 59
  61. 61. Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource wise you are gonna have a bad time  You need lots of minions in different parts of the world  Doesn‟t actually require an amazing CPU or RAM if you do it correctly.  Storing all that data...  Querying that data... Is it possible to have a cheap, distributed portscanning system? 61
  62. 62. Portscanning problems... 62 Minion
  63. 63. Portscanning 63
  64. 64. Data…. 64
  65. 65. Data 65
  66. 66. Internet status... 66
  67. 67. Internet status... 67
  68. 68. If we're doing it... Anyone else can. Evil side? 68
  69. 69. Anubis StreamForce  Have cool ideas? Contact us  Access for Brucon participants: API Endpoint: http://brucon.cyberfeed.net:8080/stream?key=brucon2013  Web UI Dashboard maker: http://brucon.cyberfeed.net:8080/webgui 69
  70. 70. Lol  Last minute testing 70
  71. 71. Questions? 71
