Taming Botnets
Life cycle and detection of bot infections through network traffic analysis
Agenda
● Introduction
● Bots and botnets: short walk-through
● Taming botnets: Detection and Evasion
● Our approach
● Case studies
● Conclusion
● Disclaimer: We steal our images from Google Images :)
Introduction
● Why are we doing this research?
● Objectives
● Our data sources
● Our environment: a bunch of code in Node.js and Python, a customized sandboxing platform (Cuckoo-based), and data indexed in Solr
Introduction: bots
● “bot”: a software program installed on target machine(s) in order to use that machine's computational or network resources, or to collect information
● A typical bot is controlled by an external party and therefore needs a communication channel to receive commands and pass information
● Bots are typically used for malicious purposes ;-)
Introduction: bots (lifecycle)
● Installation (infection) phase: often by means of a software exploit or a social engineering technique (fake antivirus, fake software update)
● Post-infection phase: communication (C&C, peers, etc.)
Introduction
● Our basic assumption is that a bot needs to be able to communicate back in order to be useful.
● Our analysis is primarily “blackbox”: we observe the network traffic of a large network infrastructure in order to identify possible infections and “communication” links
● We also utilize sandboxing techniques to observe behavior (mainly from the network side)
● We do not attempt to reverse engineer (manually or automatically) botnet software
Botnets
● Infection vectors → often targeting end-user machines (clients) in large numbers by exploiting a software vulnerability in the browser or related components
● C&C communication:
  ● Remember IRC bots? :)
  ● Over HTTP (most common)
  ● Proprietary protocol
  ● Centralized or P2P infrastructure
Botnets: lifecycle
● C&C hosting itself is another interesting research area ;-)
How do you get bots (pt 2)
● SEO poisoning/manipulation
How do you get bots (pt 3)
● Advertisements and malvertisements: a whole new ecosystem. OpenX is a huge security hole ;)
Anyways
● Once infected, the bot talks back... Let's look at some real-life cases (the data is very recent, mostly from the past few months).
Old-school bots (still active, for real!)
May 2012: IRC bots are still real :-D ;-)
Carberp
● Bot infection: Drive-By-HTTP
● Payload and intermediate malware domains: normal, just registered/DynDNS
● Distributed via: a great many compromised web sites; top score: more than 100 compromised resources detected in one week
● C&C domains are usually generated, but there are some special cases below ;-)
● C&C and malware domains are located on the same AS (from the bot's point of view), which makes them easy to detect
● Typical bot activity: mass HTTP POST
Detection during infection and by post-infection activity
● Infection: executable transfer from a just-registered domain (e.g. lifenews-sport.org) or a DynDNS domain (e.g. uphchtxmji.homelinux.com)
● Updates: executable transfer from a just-registered or DynDNS domain
● Post-infection activity: mass HTTP POST to generated domains such as n87e0wfoghoucjfe0id.org; URLs end with varying extensions
Netprotocol.exe
● Bot infection: formerly Drive-By-FTP; now Drive-By-FTP and Drive-By-HTTP
● Payload and intermediate malware domains: normal, obfuscated
● Distributed via: compromised web sites
● C&C domains are usually generated; many domains are in the .be zone
● C&C and malware domains are located on different ASes. The bot updates its payload via HTTP
● Typical bot activity: HTTP POST, payload updates via HTTP
Attack analysis
- Script from www.java.com used during the attack
- Applet exp.jar loaded via FTP
- FTP server IP address obfuscated to avoid detection
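One way to see through this kind of obfuscation is to note that a URL host written as a single decimal integer is just a 32-bit IPv4 address. The sketch below is a minimal illustration, not the tooling used in this research; the function names are hypothetical, and the regular expression for the Java-style FTP password is an assumption based on the two passwords shown in this deck (Java1.6.0_29@ and Java1.6.0_30@).

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Assumed pattern: "Java" + version + "@", as seen in the anonymous FTP logins.
JAVA_FTP_PASSWORD = re.compile(r"^Java\d+\.\d+\.\d+_\d+@$")


def decode_obfuscated_host(url):
    """Return the dotted-quad form of an all-digit URL host, or None."""
    host = urlsplit(url).hostname
    if host is None or not host.isdigit():
        return None
    value = int(host)
    if value > 0xFFFFFFFF:  # does not fit in a 32-bit IPv4 address
        return None
    return str(ipaddress.IPv4Address(value))


def looks_like_java_ftp_password(password):
    """True if an FTP password looks like the Java runtime's default."""
    return bool(JAVA_FTP_PASSWORD.match(password))


if __name__ == "__main__":
    print(decode_obfuscated_host("ftp://3645456330/6/e.jar"))
    print(looks_like_java_ftp_password("Java1.6.0_29@"))
```

An FTP download of a .jar file from an all-digit host, combined with a Java-style anonymous password, is a much narrower signal than either property alone.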
Interesting modifications
GET http://java.com/ru/download/windows_ie.jsp?host=java.com%26returnPage=ftp://188.8.131.52/1/s.html%26locale=ru HTTP/1.1

Key feature example
Date/Time            2012-04-20 11:11:49 MSD
Tag Name             FTP_Pass
Target IP Address    184.108.40.206
Target Object Name   21
:password            Java1.6.0_30@
:user                anonymous
Activity example
                     Event 1                    Event 2
Date/Time            2012-04-29 02:05:48 MSD    2012-04-29 02:06:08 MSD
Tag Name             HTTP_Post                  HTTP_Post
Target IP Address    220.127.116.11             18.104.22.168
:server              rugtif.be                  eksyghskgsbakrys.com
:URL                 /check_system.php          /check_system.php
● Domain registered: 2012-04-21
On-host detection and activity
Payload: usually netprotocol.exe, located in Users\USER_NAME\AppData\Roaming, which periodically downloads other malware
Further payload loaded via HTTP: http://22.214.171.124/view_img.php?c=4&k=a4422297a462ec0f01b83bc96068e064
Detection by AV
Sample from May 09 2012: detection ratio 1/42
● (demos, recorded as videos)
Detection during infection and by post-infection activity
● Infection: .jar and .dat files downloaded via FTP; the server name is an obfuscated IP address, e.g. ftp://3645456330/6/e.jar; the Java version appears in the FTP password, e.g. Java1.6.0_29@
● Updates: executable transfer from some Internet host, e.g. GET http://126.96.36.199/f/kwe.exe
● Post-infection activity: mass HTTP POST to normal and generated domains with the URL check_system.php:
  09:04:46 POST http://hander.be/check_system.php
  09:05:06 POST http://aratecti.be/check_system.php
  09:06:48 POST http://hander.be/check_system.php
  09:07:11 POST http://aratecti.be/check_system.php
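The post-infection pattern above (one client POSTing the same path to several different domains) can be picked out of a proxy log with very little code. This is a minimal sketch under assumptions: the three-field log format simply mirrors the lines on this slide, and the function name and the `min_hosts` threshold are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Log lines as shown on the slide: time, method, URL.
LOG = """\
09:04:46 POST http://hander.be/check_system.php
09:05:06 POST http://aratecti.be/check_system.php
09:06:48 POST http://hander.be/check_system.php
09:07:11 POST http://aratecti.be/check_system.php
"""


def shared_post_paths(log_text, min_hosts=2):
    """Map each POSTed URL path to its hosts, keeping paths seen on
    at least min_hosts distinct hosts (threshold is an assumption)."""
    hosts_by_path = defaultdict(set)
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[1] != "POST":
            continue  # skip malformed lines and other methods
        url = urlsplit(parts[2])
        hosts_by_path[url.path].add(url.hostname)
    return {
        path: sorted(hosts)
        for path, hosts in hosts_by_path.items()
        if len(hosts) >= min_hosts
    }


if __name__ == "__main__":
    for path, hosts in shared_post_paths(LOG).items():
        print(path, hosts)
```

In real traffic this would need to be computed per client and would need a whitelist for common paths that legitimately recur across many sites.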
Noproblemslove.com, whoismistergreen.com, etc...
● Bot infection: Drive-By-HTTP
● Payload and intermediate malware domains: normal/DynDNS
● Distributed via: compromised web sites
● C&C domains: normal
● C&C and malware domains are located on different ASes. Sophisticated attack scheme. Timeout before activity.
● Typical bot activity: mass HTTP POST
Hoster range and AS
www.google-analylics.com looks good, BUT Google, Rambler and Yandex together on 188.8.131.52/29?
Hoster range and autonomous system (AS) are useful when you analyze suspicious events.
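The hoster-range check can be sketched as grouping resolved domains by the small network they sit in and looking at which names end up together. This is an illustration only: the function name is hypothetical, the /29 default follows the slide, and the sample domains and addresses are placeholders from the documentation ranges, not observed data.

```python
import ipaddress
from collections import defaultdict


def group_by_network(resolutions, prefix_len=29):
    """Group (domain, ip) pairs by enclosing network and return only
    the networks that host more than one domain."""
    groups = defaultdict(set)
    for domain, ip in resolutions:
        # strict=False masks the host bits, e.g. x.x.x.6/29 -> x.x.x.0/29
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(network)].add(domain)
    return {
        network: sorted(domains)
        for network, domains in groups.items()
        if len(domains) > 1
    }


# Placeholder data: three look-alike names in one /29, one unrelated host.
SAMPLE_RESOLUTIONS = [
    ("analytics-lookalike.example", "192.0.2.1"),
    ("search-lookalike.example", "192.0.2.3"),
    ("portal-lookalike.example", "192.0.2.6"),
    ("unrelated.example", "198.51.100.10"),
]

if __name__ == "__main__":
    for network, domains in group_by_network(SAMPLE_RESOLUTIONS).items():
        print(network, domains)
```

Names imitating several unrelated large services inside one tiny range are exactly the sort of co-location that deserves a closer look; the same grouping can be repeated with the AS number as the key.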
What's common

whoismistergreen.com
IP address: 184.108.40.206
Created: 2011-07-26
Registrant Name: JOHN ABRAHAM
Address: ul. Dubois 119
City: Lodz

noproblemslove.com
IP address: 220.127.116.11
Created: 2011-12-07
Registrant Contact: Whois Privacy Protection Service, Whois Agent
firstname.lastname@example.org

(domain name not given)
IP was: 18.104.22.168
IP now: 126.96.36.199
Created: 2011-07-21
Registrant Name: patrick jane
Address: ul. Dubois 119
City: Lodz

noproblemsbro.com
IP address: 22.214.171.124
Created: 2011-12-07
Registrant Contact: Whois Privacy Protection Service, Whois Agent
email@example.com
Detection during infection and by post-infection activity
● Infection: executable transfer from just-registered or DynDNS domains, such as fx58.ddns.us
● Updates: application/octet-stream bulk data load from the C&C
● Post-infection activity: mass HTTP POST to normal-looking domains, e.g. noproblemslove.com, whoismistergreen.com, etc.
Cross-correlation data sources
● WHOIS (including Team Cymru whois)
● Our own DNS index; we are also talking to ISC about possibilities of data swaps
● Sandbox farm (mainly to detect compromised websites automagically and study behavior)
● Public “malicious IP address” databases
● Public reputation (i.e. ToS) databases
● (still a work in progress)
Detection
● Manual and automated
● Automated detection is largely based on analysis of network traffic:
  ● Anomaly detection
  ● Pattern-based analysis
  ● Signatures (Snort!)
  ● Traffic profiling (DNS traffic profiling, HTTP traffic profiling, etc.)
Detection
● Detecting malicious botnet activity is very popular in academia (an interesting problem).
● In our research we do not claim extreme novelty; rather, we demonstrate our experience and a few practical solutions that seem to work :-)
Detection: interesting bits
● Botnet detection has evolved from a pattern-based approach (hardcoding bot CMD patterns and capturing them with Snort) to a complex field of generic detection of automated “call-back” communication channels.
Detection
● Different “callback” methods, as seen in the wild, possess interesting properties, such as:
  ● Large number of failed DNS requests
  ● Large number of DNS requests for IP addresses which are offline
  ● Connection attempts to mostly dead IP addresses
  ● Traffic pattern (differs from regular browsing)
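The first property, a large number of failed DNS requests, is the simplest one to turn into a check. The sketch below assumes DNS logs have already been reduced to (client, query name, response code) tuples; the function names, both thresholds, and the sample records are hypothetical and would need tuning against real traffic.

```python
from collections import Counter


def nxdomain_ratios(records):
    """records: iterable of (client_ip, query_name, rcode) tuples.
    Returns the share of NXDOMAIN answers per client."""
    total = Counter()
    failed = Counter()
    for client, _qname, rcode in records:
        total[client] += 1
        if rcode == "NXDOMAIN":
            failed[client] += 1
    return {client: failed[client] / total[client] for client in total}


def noisy_clients(records, min_queries=5, min_ratio=0.5):
    """Clients with enough queries and a high NXDOMAIN share
    (both thresholds are illustrative assumptions)."""
    records = list(records)
    counts = Counter(client for client, _qname, _rcode in records)
    ratios = nxdomain_ratios(records)
    return sorted(
        client
        for client, ratio in ratios.items()
        if counts[client] >= min_queries and ratio >= min_ratio
    )


# Placeholder records: one client mostly failing, one mostly succeeding.
SAMPLE = (
    [("10.0.0.5", f"gen{i}.example", "NXDOMAIN") for i in range(5)]
    + [("10.0.0.5", "live.example", "NOERROR")]
    + [("10.0.0.9", f"site{i}.example", "NOERROR") for i in range(5)]
    + [("10.0.0.9", "typo.example", "NXDOMAIN")]
)

if __name__ == "__main__":
    print(noisy_clients(SAMPLE))
```

A bot cycling through generated C&C names tends to fail far more lookups than a person browsing, who mostly mistypes the occasional name.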
Cat and mouse game
● Of course, all of this is easy to evade once you know the method. But security is always a cat-and-mouse game ;-)
Detection
● Detecting botnet activities by analyzing DNS traffic:
  ● Analyzing DNS names (dictionary comparison, alphanumeric characters, detection of “generated” domain names by similarities/patterns)
  ● Analyzing failed DNS queries
  ● DNS “ranking” (based on whois information)
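As a rough illustration of the DNS name analysis, the sketch below computes a few lexical features for a domain and applies a deliberately crude rule. It is not the analysis used in this research: the features, function names, and both thresholds are assumptions, and there is no dictionary comparison here.

```python
import math
import re
from collections import Counter


def main_label(domain):
    """Longest label of the name, ignoring the top-level domain."""
    labels = domain.lower().rstrip(".").split(".")
    candidates = labels[:-1] or labels
    return max(candidates, key=len)


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def longest_consonant_run(label):
    """Length of the longest run of consecutive consonants."""
    runs = re.findall(r"[bcdfghjklmnpqrstvwxyz]+", label)
    return max((len(run) for run in runs), default=0)


def domain_features(domain):
    label = main_label(domain)
    digits = sum(ch.isdigit() for ch in label)
    return {
        "label": label,
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / len(label) if label else 0.0,
        "consonant_run": longest_consonant_run(label),
    }


def looks_generated(domain, min_run=5, min_digit_ratio=0.15):
    """Crude rule with assumed thresholds: a long consonant run or
    many digits in the main label."""
    features = domain_features(domain)
    return (
        features["consonant_run"] >= min_run
        or features["digit_ratio"] >= min_digit_ratio
    )


if __name__ == "__main__":
    for name in ("n87e0wfoghoucjfe0id.org", "eksyghskgsbakrys.com",
                 "uphchtxmji.homelinux.com", "hander.be"):
        print(name, looks_generated(name))
```

Note that pronounceable C&C names such as hander.be pass a lexical check like this one, which is why name analysis is combined with failed-query analysis and whois-based ranking rather than used alone.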
Detection
● Further step: cross-correlation to domain names which have the same WHOIS attributes
● Sandboxing (we use a modified version of Cuckoo Sandbox with user event simulation; not perfect, but it works)
  ● Challenges:
    – Simulating complex user behavior (mouse movements)
    – Simulating complex user browsing patterns (visiting X with a search engine (image?) as referer)
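The WHOIS cross-correlation step can be sketched as grouping domains by a normalized tuple of registrant attributes. This assumes WHOIS data has already been parsed into dictionaries; the field names, the function name, and the sample records are hypothetical placeholders.

```python
from collections import defaultdict


def correlate_by_whois(records, keys=("address", "city")):
    """Group domains sharing the same non-empty values for the given
    WHOIS fields; return only groups with more than one domain."""
    groups = defaultdict(set)
    for record in records:
        values = tuple((record.get(key) or "").strip().lower() for key in keys)
        if not all(values):
            continue  # privacy-protected or incomplete records carry no signal
        groups[values].add(record["domain"])
    return {
        values: sorted(domains)
        for values, domains in groups.items()
        if len(domains) > 1
    }


# Placeholder records: different registrant names, same postal address.
SAMPLE_WHOIS = [
    {"domain": "first.example", "registrant": "Name One",
     "address": "1 Example St", "city": "Exampleville"},
    {"domain": "second.example", "registrant": "Name Two",
     "address": "1 example st ", "city": "exampleville"},
    {"domain": "third.example", "registrant": "Name Three",
     "address": "9 Other Rd", "city": "Elsewhere"},
    {"domain": "private.example", "registrant": "Privacy Service",
     "address": None, "city": None},
]

if __name__ == "__main__":
    for attrs, domains in correlate_by_whois(SAMPLE_WHOIS).items():
        print(attrs, domains)
```

Keying on address and city rather than registrant name catches registrations where the name changes but the postal details are reused. Once one domain in a group is confirmed malicious, the rest of the group becomes a watch list.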
Detection (visualization)
● Parallel coordinates (also see the recent talk by Alexandre Dulaunoy from CIRCL.LU and Sebastien Tricaud from Picviz Labs at CanSecWest)
Detection
● (demos, let's look at some videos :)
Conclusions
● Detection is still trivial, but keep your methods “private” ;-)
● Detecting advanced botnets (name your favourite traffic profiling evasion method!) is out of scope here, unless such methods become widespread
● The cat and mouse game is still fun! ;-)
Tips and recommendations
● For infected machines: boot from clean media and periodically do OFFLINE AV checking
● Monitor network traffic for any unusual activity
● Default-deny firewall policies + block any active executable content
Questions
● Contact us at:
  ● firstname.lastname@example.org
  ● email@example.com
● See http://github.com/fygrave/dnslyzer for some code