Presentation Brucon - Anubisnetworks and PTCoresec

  • Internet scale. Devices, systems, firewalls, IDS, etc.
  • Hi, I’m going to present the next section of the presentation. So, how can we collect events from the Stream? What information can we gather from those events? How can we access those events in real time?
  • The challenge here is the large number of events per second: in total we currently have over 6,000 events per second, and 4,000 of those come from a single feed called banktrojans, which is basically formed by infected machines. This is what an event from one of those machines looks like (a minimal consumption sketch is shown below).
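    A minimal sketch of consuming the feed from Node.js. The endpoint is the one given on the final slide; the framing (newline-delimited JSON over a long-lived HTTP response) is an assumption, not something the slides confirm.

        var http = require('http');

        // Brucon demo endpoint from the last slide; framing is assumed to be
        // newline-delimited JSON on a long-lived HTTP response.
        var url = 'http://brucon.cyberfeed.net:8080/stream?key=brucon2013';

        http.get(url, function (res) {
          var buffer = '';
          res.setEncoding('utf8');
          res.on('data', function (chunk) {
            buffer += chunk;
            var lines = buffer.split('\n');
            buffer = lines.pop();               // keep any partial line for the next chunk
            lines.forEach(function (line) {
              if (!line.trim()) return;
              var event = JSON.parse(line);     // events are published in JSON
              if (event._origin === 'banktrojan') {
                console.log(event.trojanfamily, event.env.remote_addr);
              }
            });
          });
        });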
  • So, basically this is what we see…
  • And this is what we want. We want to know where our targets are, where to look.
  • Infected machines are usually noisy and tend to produce a large number of events. We can use the Stream to help us: the group module groups events that occur within 4 minutes of each other and originate from the same machine and trojan, so we go from 4,000 to 1,000 events per second. Basically, we receive one event for a machine and trojan, and the next events are not received because they are considered duplicates. Then we have the filter module to keep only the fields we need; for example, we only care about the IP address, ASN, trojan, C&C domain and geolocation of the machine. How do we process and store these 1,000 events per second? (A sketch of the same de-duplication idea is shown below.)
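    The grouping and filtering are done by the Stream's own modules; the sketch below only illustrates the same idea client-side, assuming a 4-minute window keyed on machine IP plus trojan family.

        // Illustrative only: mimics the Stream's group + filter modules in-process.
        var WINDOW_MS = 4 * 60 * 1000;   // 4-minute grouping window
        var lastSeen = {};               // "ip|trojan" -> timestamp of last emitted event

        function dedupe(event) {
          var key = event.env.remote_addr + '|' + event.trojanfamily;
          var now = Date.now();
          if (lastSeen[key] && now - lastSeen[key] < WINDOW_MS) {
            return null;                 // duplicate inside the window, discard
          }
          lastSeen[key] = now;
          // filter: keep only the fields we actually need
          return {
            ip: event.env.remote_addr,
            trojan: event.trojanfamily,
            geo: event._geo_env_remote_addr
          };
        }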
  • First, some technical information about the technologies we use. For application development we use NodeJS, a server-side JavaScript platform built on top of the V8 engine; it’s fast, scalable and has modules for almost everything. For data storage, MongoDB is a NoSQL database that is fast and scalable; it can also store JSON-style documents and files in GridFS. And then we have Redis, a key-value store that is very fast and also scalable.
  • This is an overview of the Data Collection. We built 3 applications: the Collector, the Worker and the Processor. Events come from the Stream to the Collector, which distributes the workload to Workers that process and store the information in MongoDB and Redis. The Processor then aggregates that information and stores it for statistical and historical analysis.
  • Events come from the Stream to the Collector. The Collector then distributes the workload to Workers that process and store the information in MongoDB and Redis. The Processor then gathers information from Redis and stores it in MongoDB for statistical and historical analysis.
  • So, the Collector talks to these 3 components. It maintains the information in MongoDB, removing machines that don’t produce events for more than 24 hours. It decrements the Redis counters, and while maintaining this information it can also send warnings. Workers receive events from the Collector and can run on any machine with a connection to the Collector and the databases.
  • The Worker processes and stores the event in MongoDB, creating new entries or adding information about new trojans to existing entries. It also updates the last time we saw an event for that machine. While updating MongoDB, the Worker also needs to maintain the Redis counters, incrementing the values for new entries or updating counters for a new trojan on a machine we had already seen. While performing this task it can also decide whether a warning should be sent. (A sketch of this persist-and-count step follows.)
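    A minimal sketch of the Worker's persist step, assuming the 2013-era 'mongodb' and 'redis' Node modules. The collection name 'machines_24h' and the exact counter keys are assumptions for illustration; the field names come from the slides.

        var MongoClient = require('mongodb').MongoClient;
        var redisClient = require('redis').createClient();

        MongoClient.connect('mongodb://localhost/cyberfeed', function (err, db) {
          if (err) throw err;
          var machines = db.collection('machines_24h');   // assumed collection name

          function persist(evt) {                         // evt = filtered stream event
            machines.findAndModify(
              { ip: evt.ip },                             // one document per machine
              [],
              { $set: { last: new Date() },               // refresh "last seen"
                $addToSet: { trojan: evt.trojan } },      // record trojan if new
              { upsert: true },
              function (err, previous) {
                if (err) return console.error(err);
                var isNewMachine = !previous;
                var isNewTrojan = isNewMachine ||
                  (previous.trojan || []).indexOf(evt.trojan) === -1;
                if (isNewTrojan) {
                  redisClient.incr(evt.trojan + ':' + evt.geo.country_code); // e.g. "zeus:PT"
                  redisClient.incr(evt.trojan);                              // per-trojan total
                }
                if (isNewMachine) {
                  redisClient.incr(evt.geo.country_code);                    // per-country total
                }
              });
          }
        });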
  • The last component is the Processor. It retrieves the real-time counters from Redis, processes them and stores them in MongoDB aggregated by botnet, ASN, country, etc. This information can then be analysed and queried.
  • Let’s now check the databases. The MongoDB collection that stores information about machines active in the last 24 hours looks like this: a JSON document with information about geolocation, IP address, trojans, last time seen, etc. There is also a numerical representation of the IP address that helps to query for specific network ranges (see the sketch below).
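    A short sketch of why ip_numeric is stored: converting dotted quads to integers turns a CIDR such as 95.68.149.0/22 (the example used on the API slide) into a simple range query. The collection name is a placeholder.

        // dotted quad -> 32-bit integer
        function ipToNumeric(ip) {
          return ip.split('.').reduce(function (n, octet) {
            return n * 256 + parseInt(octet, 10);
          }, 0);
        }

        // "95.68.149.0/22" -> [lowest, highest] numeric address in the block
        function cidrRange(cidr) {
          var parts = cidr.split('/');
          var base = ipToNumeric(parts[0]);
          var size = Math.pow(2, 32 - parseInt(parts[1], 10));
          return [base, base + size - 1];
        }

        var range = cidrRange('95.68.149.0/22');
        var query = { ip_numeric: { $gte: range[0], $lte: range[1] } };
        // db.collection('machines_24h').find(query).toArray(function (err, docs) { ... });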
  • The aggregated-information collection holds documents with this format. The metadata field holds information about the specific document, its type and the origin of the information; in this case it’s country and trojan. It has an entry per hour with the number of infections. These entries need to be preallocated with zeros, so every day a new document is created for a specific metadata with all the hours at 0. If we don’t do this there will be a lot of document extensions in MongoDB and it will become very slow. (A preallocation sketch follows.)
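    A sketch of the preallocation idea: create the daily document with every hour already present and set to zero, so later updates are in-place $inc operations and never grow the document on disk. The helper name is hypothetical.

        function preallocateDay(collection, metadata, callback) {
          var doc = { metadata: metadata };   // e.g. { country: "CH", trojan: "conficker_b", type: "daily", date: ... }
          for (var hour = 0; hour < 24; hour++) {
            doc[String(hour)] = 0;            // hours "0".."23" preallocated at zero
          }
          collection.insert(doc, callback);
        }

        // Later, the Processor only increments fields that already exist:
        // collection.update({ metadata: metadata }, { $inc: { '14': newInfectionsAt14h } }, callback);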
  • Some more information about these collections. The 24-hours collection is sharded across 4 MongoDB instances and in July it held information about over 3 million infected machines, which only takes 2 GB of disk to store. The aggregated information, collected over 119 days, had over 18 million entries and occupied around 6.5 GB of data; that’s around 56 MB per day. These were the indexes created. We need to be very careful with them because they speed up reads but slow down writes. We want fast writes on the 24-hours collection, and for that reason we need to keep the indexes optimized; only the IP index is built in the foreground, all the others are built in the background. For the aggregated-information collection we don’t need to be as careful: we can add the indexes that allow us to perform faster queries.
  • Let’s look at the Redis information. The counters look like this: the keys are concatenations of strings separated by colons, for example "cutwailbt:RO", the counter for that trojan in Romania.
  • Redis is very fast: we can retrieve all the information from the biggest counter sets (over 120,000 entries) in around half a second. Inserting data is also very fast while using very few resources on the machine.
  • There was also a need to access all this information on demand, so an API was created that allows retrieving or querying information from both Redis and MongoDB. (A minimal endpoint sketch follows.)
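    A minimal sketch of one such endpoint using plain Node HTTP plus Redis. The /counters_countries route name appears on the API slide; the handler body, including the fixed list of country keys, is only illustrative (a real implementation would enumerate the keys).

        var http = require('http');
        var redisClient = require('redis').createClient();

        http.createServer(function (req, res) {
          if (req.url === '/counters_countries') {
            // illustrative: country counters are plain Redis keys such as "PT", "US", ...
            redisClient.mget('PT', 'US', 'DE', function (err, values) {
              res.writeHead(200, { 'Content-Type': 'application/json' });
              res.end(JSON.stringify({ PT: values[0], US: values[1], DE: values[2] }));
            });
          } else {
            res.writeHead(404);
            res.end();
          }
        }).listen(8080);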
  • So, there are a couple of limitations to these approaches. By grouping events in order to reduce the number of events per second, we are discarding information that could be studied to better understand what is behind those machines; for example, the number of events from a machine with a specific botnet could indicate how many machines are on that network (everyone has a router nowadays). MongoDB can also impact everything: it is fast but needs to be used carefully. We need 3 MongoDB shards to keep performance at acceptable levels. If we start getting 2 or 3 times the events we currently have, the Workers won’t be able to persist all that information in time and will have to start discarding it at some point; the alternative to discarding is to add more shards. You also need to constantly monitor your hard drives: if their performance decreases, bad things will happen, because Mongo won’t be able to persist the information in time and will start slowing everything down.
  • How can we evolve this solution? We can send more warnings with the information we have, but when? What thresholds should we use? We only aggregate information by hour and day; what about weeks, months, years? What about shorter intervals? We can also apply data-mining algorithms in order to extract more information from the data we already collect. And of course, apply these principles to other feeds like Spam or Twitter.
  • So how do we extract information about a specific network or country? What about what happened last week?
  • Of course we used NodeJS and built 2 applications: one is used as an API to access and request reports, and the other checks the database for requests, generates the reports and stores them. The reports are saved in CSV or JSON format for later querying. They are also sent by email, where we give a URL to download the files.
  • The collections that hold the CSV reports look like this. There is a scheduled-work collection that keeps a record of the report it is generating and the reports it has already generated. Each report keeps an array of files that are generated and saved in MongoDB’s storage for files, called GridFS.
  • Then we have the JSON reports, which we call snapshots. The main differences are the count field in the snapshot, which holds the number of infected machines in that snapshot, and the results for that snapshot, which include the information about the machine and the metadata that identifies the origin of that entry. We could store an array of results in the Snapshot collection, but it would be hard to use because it would have too many entries, possibly millions, and would just be useless.
  • How could we evolve the Reports? We can store reports in other formats, generate charts for a report with specific information, and start storing other types of reports, not just for botnets.
  • So, how can we visualize real-time events? Let’s focus on the botnets again: it would be awesome if we could see the distribution of botnets throughout the world, receive warnings and monitor other information in real time. For that purpose, there is a shiny globe (demo). We can see in real time when infected machines produce events, monitor a top of the countries most infected with a specific trojan, the number of events being generated every second and the total number of infections.
  • This information comes from the Stream; we group it by trojan and country. We don’t really want to send ALL the events to the browser because some browsers would just crash, so we also filter to keep only the geolocation and trojan family. The information about the top infected countries comes from a KPI module that dynamically calculates the top in the Stream.
  • Between the Stream and the Browser we have a NodeJS application that controls the flow of events to the browser, discarding events if too many are received and relaying the information to the browser using the socket.io module. We also need to get the total number of infected machines from the Redis counters. (A relay sketch is shown below.)
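    A minimal sketch of that relay, assuming a socket.io 0.x-era API. The throttle limit, event names and the Redis key for the total are illustrative, not taken from the slides.

        var io = require('socket.io').listen(8081);
        var redisClient = require('redis').createClient();

        var queue = [];
        var MAX_PER_TICK = 50;                    // anything beyond this is discarded

        // called by the stream consumer for every grouped/filtered event
        function onStreamEvent(evt) {
          if (queue.length < MAX_PER_TICK) queue.push(evt);
        }

        setInterval(function () {
          if (queue.length > 0) {
            io.sockets.emit('events', queue.splice(0));      // flush batch to all browsers
          }
          redisClient.get('total_infections', function (err, total) {  // assumed key name
            if (!err) io.sockets.emit('total', total);
          });
        }, 1000);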
  • At the browser end we use the socket.io client to receive the events, process those events using WebWorkers (calculating where to place the dots) and render everything using WebGL. (A sketch of the geolocation-to-3D calculation follows.)
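    A sketch of the kind of calculation the WebWorker performs: converting a latitude/longitude pair into a point on a sphere of radius r. This is standard spherical geometry, not code taken from the actual globe.

        // latitude/longitude (degrees) -> 3D point on a sphere of radius r
        function latLngToPoint(lat, lng, r) {
          var phi = (90 - lat) * Math.PI / 180;     // polar angle
          var theta = (180 - lng) * Math.PI / 180;  // azimuthal angle
          return {
            x: r * Math.sin(phi) * Math.cos(theta),
            y: r * Math.cos(phi),
            z: r * Math.sin(phi) * Math.sin(theta)
          };
        }

        // inside the WebWorker:
        // self.onmessage = function (msg) {
        //   var e = msg.data;
        //   self.postMessage(latLngToPoint(e.geo.latitude, e.geo.longitude, 200));
        // };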
  • We can evolve the globe to create a more interactive experience where we could perform actions in real time through the globe. We can also show warnings on the globe, for example about new infections.
  • How can we add valuable information to the information we already have?
  • Typically the operations that would add value are expensive: they need CPU and bandwidth. So we needed a master-slave approach that distributes the work among multiple slaves, which we called Minions.
  • Masters receive work from the Requesters and store that work in MongoDB. Minions then request work and send the results back to the Master. The Master then sends updates directly to the Requester of the work and also stores the results in MongoDB.
  • The Master has an API that allows custom Requesters to ask for work and monitor the work results received from the Minions. The Minion application was built with a modular architecture in mind, so it is very easy to create a custom module. Information received by the Minions can then be injected into the Stream or stored in a database. (A sketch of what such a module could look like follows.)
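    A purely illustrative sketch of a Minion module for DNS resolution. The module contract (a name plus a run(work, done) function) is an assumption; the slides only say the architecture is modular.

        var dns = require('dns');

        module.exports = {
          name: 'dns-resolve',
          // work = { domain: 'example.com' }; done(err, result) reports back to the Master
          run: function (work, done) {
            dns.resolve4(work.domain, function (err, addresses) {
              if (err) return done(err);
              done(null, { domain: work.domain, addresses: addresses });
            });
          }
        };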
  • Getting the full picture of an infected machine or a network involves lots of steps: sinkholing the botnet; portscanning the target, which gives you an idea of whether the machine is connected directly to the internet or behind a gateway, whether there are shares available, and how this machine could possibly have been compromised (MS08-067?); DNS analysis.
  • We are going to focus on: portscanning, DNS resolutions and realtime demos.
  • It’s really cool to have a super fast scanner in a lab giving 1 quadrillion packets per second. However, this is the wrong way. The correct way: slow scans, geo-distributed. Scanning Angola from Australia means 60% of services time out and look closed; scanning the USA from Russia, or vice versa, is just as unreliable.
  • Combining a Model B Raspberry Pi with the PwnPi distro and a custom set of scripts makes it a Minion: a cheap device that we can use to do distributed scanning, and we can even ask others to deploy one and contribute to our system. In the near future we intend to make this image available to others who want to contribute to our system.

Presentation Transcript

  • Real time analysis and visualization ANUBISNETWORKS LABS PTCORESEC 1
  • Agenda  Who are we?  AnubisNetworks Stream  Stream Information Processing  Adding Valuable Information to Stream Events 2
  • Who are we?  Tiago Martins  AnubisNetworks  @Gank_101 3  João Gouveia  AnubisNetworks  @jgouv  Tiago Henriques  Centralway  @Balgan
  • Anubis StreamForce  Events (lots and lots of events)  Events are “volatile” by nature  They exist only if someone is listening  Remember?: “If a tree falls in a forest and no one is around to hear it, does it make a sound?” 4
  • Anubis StreamForce  Enter security Big Data “a brave new world” 5 Volume Variety Velocity We are here
  • Anubis StreamForce  Problems (and ambitions) to tackle  The huge amount and variety of data to process  Mechanisms to share data across multiple systems, organizations, teams, companies..  Common API for dealing with all this (both from a producer and a consumer perspective) 6
  • Anubis StreamForce  Enter the security events CEP - StreamForce High performance, scalable, Complex Event Processor (CEP) – 1 node (commodity hw) = 50k evt/second Uses streaming technology Follows a publish / subscriber model 7
  • Anubis StreamForce  Data format Events are published in JSON format Events are consumed in JSON format 8
  • Anubis StreamForce  Yes, we love JSON 9
  • Anubis StreamForce 10 Sharing Models
  • MFE OpenSource / MailSpike community Dashboard Dashboard Complex Event Processing Sinkholes Data-theft Trojans Real Time Feeds Real Time Feeds IP Reputation Passive DNSTraps / Honeypots Twitter
  • MFE OpenSource / MailSpike community Dashboard Dashboard Complex Event Processing Sinkholes Data-theft Trojans Real Time Feeds Real Time Feeds IP Reputation Passive DNSTraps / Honeypots Twitter
  • Anubis CyberFeed 13  Feed galore! Sinkhole data, traps, IP reputation, etc.  Bespoke feeds (create your own view)  Measure, group, correlate, de-duplicate ..  High volume (usually ~6,000 events per second, more data being added frequently)
  • MFE OpenSource / MailSpike community Dashboard Event navigation Complex Event Processing Sinkholes Data-theft Trojans Real Time Feeds Real Time Feeds IP Reputation Passive DNSTraps / Honeypots Twitter
  • Anubis CyberFeed 15  Apps (demo time)
  • Stream Information Processing  Collecting events from the Stream.  Generating reports.  Real time visualization. 16
  • Challenge  ~6k events/s and at peak over 10k events/s.  Let's focus on trojans feed (banktrojan).  Peaks @ ~4k events/s {"_origin":"banktrojan","env":{"server_name":"anam0rph.su","remote_addr":"46.247.141.66","path_info":"/in.php","request_method":"POST","http_user_agent":"Mozilla/4.0"},"data":"upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0","seen":1379956636,"trojanfamily":"Zeus","_provider":"lab","hostn":"lab14","_ts":1379956641} 17
  • Challenge 18
  • Challenge 19
  • Challenge  Let's use the Stream to help  Group by machine and trojan  From peak ~4k/s to peak ~1k/s  Filter fields.  Geo location  We end up with {"env":{"remote_addr":"207.215.48.83"},"trojanfamily":"W32Expiro","_geo_env_remote_addr":{"country_code":"US","country_name":"United States","city":"Los Angeles","latitude":34.0067,"longitude":-118.3455,"asn":7132,"asn_name":"AS for SBIS-AS"}} 20
  • Challenge  How to process and store these events? 21
  • Technologies 22  Applications  NodeJS  Server-side Javascript Platform.  V8 Javascript Engine.  http://nodejs.org/ Why?  Great for prototyping.  Fast and scalable.  Modules for (almost) everything.
  • Technologies 23  Databases  MongoDB  NoSQL Database.  Stores JSON-style documents.  GridFS  http://www.mongodb.org/ Why?  JSON from the Stream, JSON in the database.  Fast and scalable.  Redis  Key-value storage.  In-memory dataset.  http://redis.io/ Why?  Faster than MongoDB for certain operations, like keeping track of number of infected machines.  Very fast and scalable.
  • Data Collection 24 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  Applications  Collector  Worker  Processor  Databases  MongoDB  Redis Collector Stream
  • Data Collection 25 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  Events comes from the Stream.  Collector distributes events to Workers.  Workers persist event information.  Processor aggregates information and stores it for statistical and historical analysis. Collector Stream
  • Data Collection 26 Storage Aggregate information MongoDB Redis Worker Worker Worker Processor Process real time events  MongoDB  Real time information of infected machines.  Historical aggregated information.  Redis  Real time counters of infected machines. Collector Stream
  • Data Collection - Collector 27 Collector  Old data is periodically removed, i.e. machines that don't produce events for more than 24 hours.  Send events to Workers.  Decrements counters of removed information.  Send warnings  Country / ASN is no longer infected.  Botnet X decreased Y % of its size.
  • Data Collection - Worker 28 Worker  Create new entries for unseen machines.  Adds information about new trojans / domains.  Update the last time the machine was seen.  Process events and update the Redis counters accordingly.  Needs to check MongoDB to determine if:  New entry – All counters incremented  Existing entry – Increment only the counters related to that Trojan  Send warnings  Botnet X increased Y % in its size.  New infections seen on Country / ASN.
  • Data Collection - Processor Processor 29  Processor retrieves real time counters from Redis.  Information is processed by:  Botnet;  ASN;  Country;  Botnet/Country;  Botnet/ASN/Country;  Total.  Persisting information to MongoDB creates a historic database of counters that can be queried and analyzed.
  • Data Collection - MongoDB  Collection for active machines in the last 24h { "city" : "Philippine", "country" : "PH", "region" : "N/A", "geo" : { "lat" : 16.4499, "lng" : 120.5499 }, "created" : ISODate("2013-09-21T00:19:12.227Z "), "domains" : [ { "domain" : "hzmksreiuojy.nl", "trojan" : "zeus", "last" : ISODate("2013-09-21T09:42:56.799Z"), "created" : ISODate("2013-09-21T00:19:12.227Z") } ], "host" : "112.202.37.72.pldt.net", "ip" : "112.202.37.72", "ip_numeric" : 1892296008, "asn" : "Philippine Long Distance Telephone Company", "asn_code" : 9299, "last" : ISODate("2013-09-21T09:42:56.799Z"), "trojan" : [ "zeus” ] } 30
  • Data Collection - MongoDB  Collection for aggregated information (the historic counters database) { "_id" : ObjectId("519c0abac1172e813c004ac3"), "0" : 744, "1" : 745, "3" : 748, "4" : 748, "5" : 746, "6" : 745, ... "10" : 745, "11" : 742, "12" : 746, "13" : 750, "14" : 753, ... "metadata" : { "country" : "CH", "date" : "2013-05-22T00:00:00+0000", "trojan" : "conficker_b", "type" : "daily" } } 31 Preallocated entries for each hour when the document is created. If we don’t, MongoDB will keep extending the documents by adding thousands of entries every hour and it becomes very slow.
  • Data Collection - MongoDB  Collection for 24 hours  4 MongoDB Shard instances  >3 Million infected machines  ~2 Gb of data  ~558 bytes per document.  Indexes by  ip – helps inserts and updates.  ip_numeric – enables queries by CIDRs.  last – Faster removes for expired machines.  host – Hmm, is there any .gov?   country, family, asn – Speeds MongoDB queries and also allows faster custom queries.  Collection for aggregated information  Data for 119 days (25 May to 11 July)  > 18 Million entries  ~6,5 Gb of data  ~366 bytes per object  ~56 Mb per day  Indexes by  metadata.country  metadata.trojan  metadata.date  Metadata.asn  Metadata.type, metadata.country,metadata.date,met....... (all) 32
  • Data Collection - Redis  Counters by Trojan / Country "cutwailbt:RO": "1256", "rbot:LA": "3", "tdss:NP": "114", "unknown4adapt:IR": "100", "unknownaff:EE": "0", "cutwail:CM": "20", "unknownhrat3:NZ": "56", "cutwailbt:PR": "191", "shylock:NO": "1", "unknownpws:BO": "3", "unknowndgaxx:CY": "77", "fbhijack:GH": "22", "pushbot:IE": "2", "carufax:US": "424“  Counters by Trojan "unknownwindcrat": "18", "tdss": "79530", "unknownsu2": "2735", "unknowndga9": "15", "unknowndga3": "17", "ircbot": "19874", "jshijack": "35570", "adware": "294341", "zeus": "1032890", "jadtre": "40557", "w32almanahe": "13435", "festi": "1412", "qakbot": "19907", "cutwailbt": "38308“  Counters by Country “BY": "11158", "NA": "314", "BW": "326", "AS": "35", "AG": "94", "GG": "43", "ID": "142648", "MQ": "194", "IQ": "16142", "TH": "105429", "MY": "35410", "MA": "15278", "BG": "15086", "PL": "27384” 33
  • Data Collection - Redis  Redis performance in our machine  SET: 473036.88 requests per second  GET: 456412.59 requests per second  INCR: 461787.12 requests per second  Time to get real time data  Getting all the data from Familys/ASN/Counters to the NodeJS application and ready to be processed in around half a second  > 120 000 entries in… (very fast..)  Our current usage is  ~ 3% CPU (of a 2.0 Ghz core)  ~ 480 Mb of RAM 34
  • Data Collection - API  But! There is one more application..  How to easily retrieve stored data  MongoDB Rest API is a bit limited.  NodeJS HTTP + MongoDB + Redis  Redis  http://<host>/counters_countries  ...  MongoDB  http://<host>/family_country  ...  Custom MongoDB Querys  http://<host>/ips?f.ip_numeric=95.68.149.0/22  http://<host>/ips?f.country=PT  http://<host>/ips?f.host=bgovb 35
  • Data Collection - Limitations  Grouping information by machine and trojan doesn't allow us to study the real number of events per machine.  Can be useful to get an idea of the botnet operations or how many machines are behind a single IP (everyone is behind a router).  Slow MongoDB impacts everything  Worker application needs to tolerate a slow MongoDB and discard some information as a last resort.  Beware of slow disks! Data persistence occurs every 60 seconds (default) and can take too much time, having a real impact on performance..  >10s to persist is usually very bad, something is wrong with the hard drives.. 36
  • Data Collection - Evolution  Warnings  Which warnings to send? When? Thresholds?  Aggregate data by week, month, year.  Aggregate information in shorter intervals.  Data Mining algorithms applied to all the collected information.  Apply same principles to other feeds of the Stream.  Spam  Twitter  Etc.. 37
  • Reports  What's happening in country X?  What about network 192.168.0.1/24?  Can you send me the report of Y every day at 7 am?  Ohh!! Remember the report I asked for last week?  Can I get a report for ASN AnubisNetwork? 38
  • Reports 39  HTTP API  Schedule  Get  Edit  Delete  List schedules  List reports  Check MongoDB for work.  Generate CSV report or store the JSON Document for later querying.  Send email with link to files when report is ready. Server Generator
  • Reports – MongoDB CSVs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, "reports" : [ ObjectId("51d64e7037571bd24500000d"), ObjectId("51d741e8bcb161366600000c"), ObjectId("51d89367bcb161366600005f"), ObjectId("51d9e4f9bcb16136660000ca"), ObjectId("51db3678c3a15fc577000038"), ObjectId("51dc87e216eea97c20000007"), ObjectId("51ddd964a89164643b000001") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Report { "__v" : 0, "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "files" : [ ObjectId("51d89368bcb1613666000060") ], "work" : ObjectId("51d64e6d5e8fd0d145000008") }  Files  Each report has an array of files that represents the report.  Each file is stored in GridFS. 40
  • Reports – MongoDB JSONs  Scheduled Report { "__v" : 0, "_id" : ObjectId("51d64e6d5e8fd0d145000008"), "active" : true, "asn_code" : "", "country" : "PT", "desc" : "Portugal Trojans", "emails" : "", "range" : "", "repeat" : true, “snapshots" : [ ObjectId("521f761c0a45c3b00b000001"), ObjectId("521fb0848275044d420d392f"), ObjectId("52207c2f7c53a8494f010afa"), ObjectId("5221c9df4910ba3874000001"), ObjectId("522275724910ba3874001f66"), ObjectId("5223c6f24910ba3874003b7a"), ObjectId("522518734910ba3874005763") ], "run_at" : ISODate("2013-07-11T22:00:00Z"), "scheduled_date" : ISODate("2013-07-05T04:41:17.067Z") }  Snapshot { "_id" : ObjectId("51d89367bcb161366600005f"), "date" : ISODate("2013-07-06T22:00:07.015Z"), "work" : ObjectId("521f761c0a45c3b00b000001"), count: 123 }  Results { "machine" : { "trojan" : [ “conficker_b“ ], "ip" : "2.80.2.53", "host" : "Bl19-1-13.dsl.telepac.pt", }, … , "metadata" : { "work" : ObjectId("521f837647b8d3ba7d000001"), "snaptshot" : ObjectId("521f837aa669d0b87d000001"), "date" : ISODate("2013-08-29T00:00:00Z") }, } 41
  • Reports – Evolution  Other reports formats.  Charts?  Other type of reports. (Not only botnets).  Need to evolve Collector first. 42
  • Globe  How to visualize real time events from the stream?  Where are the botnets located?  Who's the most infected?  How many infections? 43
  • Globe – Stream  origin = banktrojan  Modules  Group  trojanfamily  _geo_env_remote_addr.country_name  grouptime=5000  Geo  Filter fields  trojanfamily  Geolocation  _geo_env_remote_addr.l*  KPI  trojanfamily  _geo_env_remote_addr.country_name  kpilimit = 10 44 Stream NodeJS Browser  Request botnets from stream
  • Globe – NodeJS 45 Stream NodeJS Browser  NodeJS  HTTP  Get JSON from Stream.  Socket.IO  Multiple protocol support (to bypass some proxies and handle old browsers).  Redis  Get real time number of infected machines.
  • Globe – Browser 46 Stream NodeJS Browser  Browser  Socket.IO Client  Real time apps.  Websockets and other types of transport.  WebGL  ThreeJS  Tween  jQuery  WebWorkers  Runs in the background.  Where to place the red dots?  Calculations from geolocation to 3D point goes here.
  • Globe – Evolution  Some kind of HUD to get better interaction and notifications.  Request actions by clicking on the globe.  Generate a report of infected machines in that area.  Request operations in that specific area.  Real time warnings  New Infections  Other types of warnings... 47
  • Adding Valuable Information to Stream Events  How to distribute workload to other machines?  Adding value to the information we already have. 48
  • Minions  Typically the operations that would had value are expensive in terms of resources  CPU  Bandwidth  Master-slave approach that distributes work among distributed slaves we called Minions. 49 Master Minion Minion Minion Minion
  • Minions 50  Master receives work from Requesters and stores the work in MongoDB.  Minions request work.  Requesters receive real time information on the work from the Master, or they can ask for work information at a later time. Process / Storage Minions Master MongoDB DNS Scan Minion Minion Requesters Minion
  • Minions  Master has an API that allows custom Requesters to ask for work and monitor the work.  Minion have a modular architecture  Easily create a custom module.  Information received from the Minions can then be processed by the Requesters and  Sent to the Stream  Saved on the database  Update existing database 51 Minion DNS Scanning Data Mining
  • Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 52
  • Extras...  So what else could we possibly do using the Stream?  Distributed Portscanning  Distributed DNS Resolutions  Transmit images  Transmit videos  Realtime tools  Data agnostic. Throw stuff at it and it will deal with it. 53 FOCUS FOCUS
  • Portscanning  Portscanning done right…  Its not only about your portscanner being able to throw 1 billion packets per second.  Location = reliability of scans.  Distributed system for portscanning is much better. But its not just about having it distributed. Its about optimizing what it scans. 54
  • Portscanning 55
  • Portscanning 56
  • Portscanning 57
  • Portscanning 58
        IP                        Australia (intervolve)  China (ChinaVPShosting)  Russia (NQHost)  USA (Ramnode)  Portugal (Zon PT)
        41.63.160.0/19 (Angola)   0 hosts up              0 hosts up               0 hosts up       0 hosts up     3 hosts up (sometimes)
        5.1.96.0/21 (China)       10 hosts up             70 hosts up              40 hosts up      10 hosts up    40 hosts up
        41.78.72.0/22 (Somalia)   0 hosts up              0 hosts up               0 hosts up       0 hosts up     33 hosts up
        92.102.229.0/24 (Russia)  20 hosts up             100 hosts up             2 hosts up       2 hosts up     150 hosts up
  • Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource wise you are gonna have a bad time 59
  • Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource wise you are gonna have a bad time 60
  • Portscanning problems...  Doing portscanning correctly brings along certain problems.  If you are not HD Moore or Dan Kaminsky, resource wise you are gonna have a bad time  You need lots of minions in different parts of the world  Doesn't actually require an amazing CPU or RAM if you do it correctly.  Storing all that data...  Querying that data... Is it possible to have a cheap, distributed portscanning system? 61
  • Portscanning problems... 62 Minion
  • Portscanning 63
  • Data…. 64
  • Data 65
  • Internet status... 66
  • Internet status... 67
  • If we're doing it... Anyone else can. Evil side? 68
  • Anubis StreamForce  Have cool ideas? Contact us  Access for Brucon participants: API Endpoint: http://brucon.cyberfeed.net:8080/stream?key=brucon 2013  Web UI Dashboard maker: http://brucon.cyberfeed.net:8080/webgui 69
  • Lol  Last minute testing 70
  • Questions? 71