SECURITY WHITEPAPER

The Role of DNS in Botnet Command and Control (C&C)

DNS is powerful and ubiquitous, yet ignored by most organizations. Today, cybercriminals rely on DNS to rally infected devices into a botnet and to mitigate takedowns by authorities. In 2011, cybercriminals started covertly tunneling botnet communications over DNS traffic to evade detection by security solutions, even though security researchers had widely publicized this threat in 2004.

QUESTION: What do you know about 101.cnc.com?
• Are any devices outside your network trying to resolve such domain hostnames through your network?
• Are any devices within your network trying to resolve hostnames to that domain?

ANSWER: Analyze logs...
Stored on DNS servers, proxies or Web servers, or firewalls.

RESULT: Post-damage forensics
• Locates infected devices delegated to be name servers for botnet C&C.
• Locates infected devices attempting to tunnel botnet C&C communications over DNS.

If you cannot answer the above questions, either because you don't keep these logs, they're not readily available, or you wouldn't know how to analyze them, you're likely blind to infected devices that have compromised your network by performing these botnet command and control (C&C) activities.

A botnet's principal single point of failure, and its beacon to security researchers, is its Internet-wide C&C architecture. From 2007-8, cybercriminals began building distributed or hybrid C&C topologies that leverage more advanced DNS-based C&C rallying mechanisms, such as third-party dynamic DNS services or their own distributed DNS services, so that C&C communications can be redirected through their own distributed proxy service. Infected devices within insecure home or business networks host these services. DNS is used to add robustness and mobility, to remove single points of failure within the architecture, and to provide anonymity for the cybercriminals running botnet C&C servers (see page 2). Fluxing the domain names and/or the IP addresses in the DNS records used by botnets makes them more difficult for the security community to take down or take over.

Today, most botnets rely on a mix of P2P-, HTTP- or IRC-based protocols to communicate between bots and/or C&C servers. However, in late 2011, security researchers began publishing papers and blogs on botnets, such as "Morto", "Feederbot" and "Katusha/Timestamper", that use a covert C&C communication method known as DNS tunneling to add stealth. [1] DNS tunneling is not new; it has existed since 1998, and the first implementation was published on Slashdot in 2000. In 2004, Dan Kaminsky widely presented his implementation for tunneling arbitrary data over DNS to the security community, but it lost the community's attention in the short term as other exploited DNS vulnerabilities, such as DNS cache poisoning, became more prevalent. [2] Today, many popular DNS tunnels are readily available for cybercriminals to build botnets that bypass firewall filters or Web proxies. [3] Ethical hackers have constructed a reverse shellcode exploit that could give cybercriminals VPN and remote access into an insecure network using valid DNS syntax to avoid detection. [4] Furthermore, with the future adoption of DKIM, IPv6 and other extensions to the basic DNS protocol, big and complex packets within DNS traffic will become more common. This will help DNS-based botnet C&C communications blend in more easily and efficiently, since they will appear normal in DNS query streams (see page 2).

In the arms race between cybercriminal organizations and the security community, C&C techniques have become so robust, stealthy and mobile that botnets are ubiquitous in both home and business networks, despite the best attempts of so-called "next generation" security solutions to prevent all malware. The "defense-in-depth" strategy needs to migrate from adding prevention layers to adding containment layers. DNS traffic is often examined only after security incidents; for example, Google discovered the advanced and persistent "Aurora" botnet that breached its network by analyzing DNS logs after the damage had occurred. The most costly damage is no longer the time IT loses remediating infected devices, but the stolen data containing sensitive company or personal information, which legal and regulatory bodies must then resolve.

[Figure: "Defense-in-depth" strategy migration. Three stages: detect malware (infected device / insecure network); prevent malware (uninfected device / secure network); contain botnets (infected device / secure network).]

[1] http://bit.ly/Symantec_Morto, http://bit.ly/Dietrich_Feederbot, http://bit.ly/CHMag_Katusha
[2] http://bit.ly/Kaminsky_DNStunneling
[3] http://bit.ly/NSTX_DNS, http://bit.ly/OzymanDNS, http://bit.ly/TCP-over-DNS, http://bit.ly/Iodine_DNS, http://bit.ly/Dns2tcp, http://bit.ly/DNScat, http://bit.ly/DeNiSe
[4] http://bit.ly/Shellcode
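The two questions posed about 101.cnc.com amount to searching DNS query logs for a domain suffix and noting which side of the network each client sits on. The following is a minimal sketch only. It assumes a hypothetical log format of one whitespace-separated "client_ip queried_name" pair per line and a hypothetical internal address range; real DNS server, proxy and firewall logs differ and need their own parsers.

```python
import ipaddress
from collections import defaultdict

# Assumption: internal hosts live in this range; adjust for the real network.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def clients_querying(log_lines, domain):
    """Group clients that queried `domain` or any hostname under it.

    Each log line is assumed to look like "<client_ip> <queried_name>".
    Returns {"internal": set_of_ips, "external": set_of_ips}.
    """
    suffix = "." + domain.lower().rstrip(".")
    hits = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        client, name = parts[0], parts[1].lower().rstrip(".")
        if name == suffix[1:] or name.endswith(suffix):
            try:
                addr = ipaddress.ip_address(client)
            except ValueError:
                continue  # first field is not an IP address; ignore the line
            side = "internal" if addr in INTERNAL_NET else "external"
            hits[side].add(client)
    return {"internal": hits["internal"], "external": hits["external"]}
```

Calling `clients_querying(lines, "cnc.com")` returns the internal clients (candidate infected devices) and the external clients (devices resolving the domain through your network) separately, which maps onto the two bullets under QUESTION.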
[Page 2 infographic: Evolution of Botnet C&C. Timeline from 1983 to 2012 and the future, with one row each for P2P, HTTP, IRC and DNS. Eras: malware (viruses, worms, Trojans, etc.) infecting isolated devices; infected devices connected to form robot networks (aka botnets); botnets are ubiquitous; future. Milestones shown include the first IRC-, HTTP-, P2P-, hybrid P2P/HTTP-, Web site/service- and fully DNS-based botnets, single and double IP flux, domain flux, bots changing host DNS settings, and DNS tunneling developed and then used for cybercrime. Panels: centralized C&C topology; distributed C&C topology; hybrid C&C topology (one example); DNS-based rallying mechanisms help cybercriminals stop takedowns by removing single points of failure (dynamic and distributed DNS services, distributed proxy services); DNS-based C&C communication helps avoid detection by blending in; DNS tunneling for covert C&C communications (stolen data leaves in a series of queries to cnc.tld, and commands return in the responses).]
The Past, Present and Future of Significant Botnet C&C Techniques

| C&C Attributes | Past | Present | Future |
|---|---|---|---|
| RALLYING MECHANISMS | Centralized topology using static IP lists | Distributed or hybrid topology using domain flux and/or IP flux (via DNS records) | |
| > Static Lists | IP addresses | Domain names and/or IP addresses | |
| > Domain Flux > Seeding | Predictable timestamp | Dynamic content hidden on popular websites (e.g. Twitter trends) that can be customized in do-it-yourself kits | |
| > Domain Flux > Crypto | Static | Frequently changing | |
| > Domain Flux > Names | Random characters | Dictionary word combinations | |
| > Domain Flux > Volume | Hundreds of domains | Tens of thousands of domains | |
| > IP Flux > Records | Single flux networks changing A resource records (first seen in the Storm/Peacomm botnets in 2007) | Double flux networks changing both A and NS resource records (first seen in the Asprox botnet in 2008) | |
| > IP Flux > Service | Existing dynamic DNS services or "personalized" third-level domain (3LD) services. Alternatively, custom DNS servers on bulletproof hosts, which allow a cybercriminal to bypass the laws or contractual terms of service regulating Internet content and service use in their country of operation and which are unlikely to cooperate with authorities. | As dynamic DNS services take a more aggressive stance against botnet abuse, and governments cooperate more quickly with the security community, cybercriminals are building their own distributed DNS services using multiple compromised hosts. Often these are initially bootstrapped via custom DNS servers on bulletproof hosts. | |
| COMMUNICATION | Centralized topology using IRC- or HTTP-based protocols | Distributed or hybrid topology using P2P- and/or HTTP-based protocols | Hybrid topology with protocol tunneling such as DNS traffic |
| > IRC > Client | Common IRC client | Cybercriminal's custom IRC client | |
| > HTTP > DIY Kits | Paid do-it-yourself malware exploit kits (e.g. Mpack, ICEPack, Fiesta) | Paid or open-source do-it-yourself botnet kits (e.g. Zeus, SpyEye, TDSS) | |
| > HTTP > Protocol | Unencrypted | Encrypted | |
| > HTTP > Hosts | Privately owned Web servers | Public Web 2.0 services (e.g. Amazon Elastic Compute Cloud, Google App Engine) and social network sites (e.g. Twitter, Facebook, Google Groups) | |
| > P2P > Port | Non-standard port numbers used by P2P protocols | Standard port numbers used by common encrypted protocols (e.g. SSH, HTTPS) | |
| > P2P > Protocol | Unauthenticated | Authenticated | |
| > P2P > Discovery | Centralized in cache servers | Distributed hash tables across the network | |
| > DNS | Not used | Phone home, data exfiltration and/or bot instructions | Trickled, non-consecutive DNS queries over long time periods to further mitigate detection |
C&C RALLYING MECHANISM DESCRIPTIONS

The rallying mechanism enables new bots to locate their peers or the C&C servers and join the botnet. While rallying can also be related to botnet recruitment and propagation, the following mechanisms serve only to network the bots.

If the security community is 100% successful in shutting down or hijacking the rallying mechanisms, the botnet falls apart into a benign collection of discrete, unorganized infections. However, if even a few C&C servers remain alive, the botnet can adapt and reconfigure itself to stay undetected or protected behind the virtual walls of international jurisdiction. Several movie analogies come to mind, such as Terminator's shape-shifting T-1000 series cyborg or Star Trek's Borg collective; both entities are very resilient unless the entire control mechanism is eliminated. Today, many botnets use a hybrid of up to all three of the following techniques, where one may initiate the rallying, one maintains the rallying, and another backs up the rallying if the other one or two are disrupted.

Static Lists
Early botnets primarily used hardcoded static lists of IP addresses or domain names. However, many firewalls can add an optional feed of known bad IP addresses to help mitigate this legacy technique, and it is often not agile enough for today's large botnet operations. While some compromised hosts initially rely on static IPs to bootstrap communications with the botnet, they then switch to one of the more robust methods below. For added mobility, cybercriminals used domain names with round-robin/multi-homing techniques to associate multiple IP addresses with a single DNS record, or used dynamic DNS services, but without abusing them via IP flux, which is described below.

Domain Flux
The botnet uses domain names generated cryptographically by a Domain Generation Algorithm (DGA), which makes it more difficult for static reputation systems to maintain an accurate list of all possible C&C domains, or for the security community to attempt to hijack the domain. Many cybercriminals register only a few of the possible generated domains at a time using dynamic DNS services. In a limited number of recent cases, such as the "Android bot", URL Flux has been used. It is similar to domain flux in that the bot uses a list of usernames generated by a Username Generation Algorithm (UGA), from which it selects a username to visit on a Web 2.0 site.

IP Flux
Modern botnets primarily use one or more hard-coded domain names that DNS servers resolve to many different IP addresses over a short span of time. This technique is also widely known as "Fast Flux" Service Networks (FFSN), as it is also associated with spam and phishing attacks. However, the term "IP Flux" best describes the result: rapid change in the location (i.e. IP address) to which the domain name of an Internet host (A) or authoritative name server (NS) resolves, caused by rapid and repeated changes to DNS records that use very low time-to-live (TTL) cache settings. Compared with static IP lists, malicious DNS records are often harder to take down than compromised IP addresses, because many records can be established for the same or many IP addresses.

These locations are actually a network of compromised hosts that act as front-end nodes, proxying DNS and C&C communication protocols to a group of backend C&C servers commonly referred to as a "fast flux mothership" (see page 2). This second layer of abstraction further increases the anonymity, security, high availability and load balancing of the botnet. It makes it nearly impossible to filter only by IP address, ASN or geo-location, and adds resiliency to takedown attempts, as it shifts the centralizing agent of control from the C&C servers to the distributed DNS architecture. In many ways the idea is comparable to Content Delivery Networks (CDN). It has evolved and advanced since the Honeynet Project Research Alliance first discovered its use.

The move by cybercriminals to use their own authoritative name servers has added greater robustness and mobility to IP Flux, and makes successful takedown more difficult for the security community. Alternatively, if the compromised devices are redirected to the cybercriminals' own recursive DNS servers, bots are able to resolve domain names to different IP addresses than the rest of the Internet sees. For example, if a security researcher or other network device tries to access the domain, it may appear not to exist. It also allows the bot to resolve well-known domain names (e.g. google.com) to C&C servers.
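The two traits this section attributes to IP Flux, many different IP addresses over a short span of time and very low TTLs, can be turned into a simple screening heuristic. The sketch below is illustrative only: the function name, the input format of (domain, ip, ttl) tuples and both thresholds are assumptions made for this example, not values from this paper.

```python
from collections import defaultdict

# Assumed thresholds for illustration only; they are not from the whitepaper.
MAX_TTL_SECONDS = 300
MIN_DISTINCT_IPS = 5


def flag_ip_flux(observations, max_ttl=MAX_TTL_SECONDS, min_ips=MIN_DISTINCT_IPS):
    """Return the sorted list of domains whose A records look like IP flux.

    `observations` is an iterable of (domain, ip, ttl_seconds) tuples, one
    per A record seen in DNS responses during the observation window.
    A domain is flagged when every TTL seen for it is low and it resolved
    to many distinct IP addresses.
    """
    ips = defaultdict(set)
    highest_ttl = defaultdict(int)
    for domain, ip, ttl in observations:
        key = domain.lower().rstrip(".")
        ips[key].add(ip)
        highest_ttl[key] = max(highest_ttl[key], ttl)
    return sorted(
        d for d in ips
        if len(ips[d]) >= min_ips and highest_ttl[d] <= max_ttl
    )
```

Because legitimate Content Delivery Networks show the same pattern, a result from `flag_ip_flux` is a lead for further investigation rather than proof of a botnet.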
C&C COMMUNICATION DESCRIPTIONS

Once the bots have joined the botnet, they regularly maintain communications to receive new commands, send data back to the C&C servers (such as sensitive company or personal information), or learn how to adapt in response to the security community's efforts to disrupt or take down the botnet's operations. Each approach has advantages and disadvantages, as the following table explains.

| Evolution | Past | Present |
|---|---|---|
| Topology | Centralized | Distributed or hybrid, yet many are still centralized |
| Protocols | IRC or HTTP | P2P |
| Setup | Easy | Hard |
| Detection | Easy | Hard |
| Communication | Small delays | Small to medium delays |
| Resiliency | Bad | Good |
| Anonymity | Bad | Good |

Depending on the communication topology, different push and pull control mechanisms are used together with the communication protocol. Command authentication, such as passwords or encryption certificates, can also be added to the communication protocol to help prevent outsiders from taking command of the botnet away from the cybercriminals, especially with P2P-based protocols.

| Direction / Topology | Centralized | Distributed |
|---|---|---|
| Push | IRC-based protocols | DDoS & spam attacks |
| Pull | HTTP-based protocols, IP Flux rallying mechanisms | P2P-based protocols |

Centralized Topologies
All early botnets, and still the majority of botnets today, use centralized topologies via HTTP-based, IRC-based or other protocols because they are easier to set up and ensure that new commands are disseminated quickly to large botnet populations. However, centralized C&C servers are easier to detect and become a single point of failure for the botnet (see page 2).

Centralized Communications via IRC-based Protocols
Only one year after the IRC protocol was invented in 1988, programmers created the first bots to let chat room (aka channel) operators log in, keep the channel open, and exercise non-malicious control. At the turn of the century, many first-generation cybercriminals were very familiar with IRC as a simple, synchronized and scalable means to chat between thousands of hosts, so it was a natural evolution to use it for the first C&C communications in 1999. Despite the advent of instant-messaging (IM) protocols such as ICQ, AIM, and MSN Messenger, which gained popularity over IRC with the masses, many "old school" networking and security professionals still use IRC. In fact, the original C&C functionality of three evolved IRC-based bot families (Agobot, SDBot, and GTBot) still constitutes a large percentage of today's botnet infections, especially since some of the source code was published by its author, with occasional infections by variants of the DSNX, Q8, kaiten, and Perlbot IRC-based families. Although IM protocols are almost the same in principle as IRC, only a few botnets have been based on them, due to the difficulty of creating individual IM accounts for each bot.

Centralized Communications via HTTP-based Protocols
However, as the security community adapted by using network firewalls to block seldom-used or unnecessary ports at the Internet gateway, cybercriminals realized that a more ubiquitous C&C protocol was needed to blend in with normal user traffic. Ports 80 and 443, used for unencrypted and encrypted Web traffic over HTTP/S, are almost universally allowed through firewalls, and a few GET and POST requests used for C&C can easily be lost amongst the exponentially growing volume of legitimate Web traffic. HTTP-based botnets greatly accelerated with advances in do-it-yourself kits, developed mainly by professional Russian cybercriminals for aspiring amateur cybercriminals, and in mid-2011 several botnet kits were leaked. Recently, public or social Web services have been gaining popularity as C&C hosts via obfuscated commands due to their added anonymity, openness and scalability. However, the security research community can also leverage this openness to shut such botnets down quickly. IDS/IPS solutions can often detect suspicious URI strings or nonstandard HTTP headers (e.g. Entity-Info, Magic-Number) used by botnets (e.g. Bredolab).

Centralized Communications via Other Protocols
FTP isn't commonly seen in the wild; however, several phishing or banking Trojan horses regularly drop off stolen data to FTP servers. Some botnets use custom UDP-only protocols, which, while easily blocked by business networks, are often able to bypass misconfigured firewalls.
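The header-based detection mentioned under HTTP-based protocols can be illustrated with a few lines of code. This is a rough sketch rather than how any particular IDS/IPS product works: the function name and the idea of matching against a fixed set are assumptions, and the set contains only the two header names this paper gives as examples.

```python
# Header names taken from the text's examples; a real IDS/IPS rule set
# would be far larger and maintained from threat intelligence.
SUSPICIOUS_HEADERS = {"entity-info", "magic-number"}


def suspicious_headers(raw_request):
    """Return the sorted suspicious header names found in a raw HTTP request.

    `raw_request` is the request text up to (and optionally past) the blank
    line that ends the header section. The body is never inspected.
    """
    head = raw_request.replace("\r\n", "\n").split("\n\n", 1)[0]
    found = set()
    for line in head.split("\n")[1:]:  # skip the request line
        name, sep, _value = line.partition(":")
        if sep and name.strip().lower() in SUSPICIOUS_HEADERS:
            found.add(name.strip().lower())
    return sorted(found)
```

Header names are compared case-insensitively because HTTP treats them that way. An empty result means only that none of the listed names appeared, not that the request is benign.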
Distributed Topologies (via P2P-based protocols)
Peer-to-peer (P2P) communications were created to distribute file sharing (e.g. MP3s) amongst large populations. From 1999 to 2003, P2P topologies and protocols quickly evolved to add robustness, stealth and mobility in response to the recording industry's and ISPs' attempts to disrupt communications and/or prosecute guilty individuals; these are exactly the qualities cybercriminals also seek for their botnet C&C communications. Using structured P2P communications as a C&C topology was first envisioned as early as 2000, but the first botnets to use it appeared in 2003, the security research community began to publish on its use in 2005, and it wasn't until 2006 that such botnets achieved some limited success. The bots are able to communicate loosely amongst their peers using the same or similar non-RFC TCP, UDP (used to bypass NAT situations) or encrypted ICMP protocols as many file sharing clients (see page 2). This topology offers the botnet better anonymity and resiliency without any single points of failure, at the expense of higher setup overhead and communication latency. However, since the knowledge about participating peers is distributed throughout the botnet itself, which gives the security research community equal access to this information, cybercriminals evolved the standard P2P protocols to include proprietary authentication.

A future evolution for P2P-based botnet C&C would be to blend in with common encrypted P2P protocol traffic that is ubiquitous within business networks. Fortunately, only one such protocol really exists today: Skype. Despite known malware instances using Skype plugins and its API, to the best of the security community's knowledge, Skype-based botnets are still exclusively theoretical. In 2005, researchers presented an extremely distributed C&C topology using random, unstructured P2P communications broadcast to any other available peers. While one of the very first experimental P2P botnets in 2003 had used such a method, it was not successful, and no other botnets have since been reported to use this topology.

Hybrid Topologies
Advanced hybrid, hierarchical C&C architectures combine the stealth of a few centralized C&C servers with the robustness of distributed peers to prevent takedown. For example, one group of bots acts as servants, since they behave as both clients and servers; they have static, non-private IP addresses and are accessible from the global Internet. The second group of bots acts only as clients, since they don't accept incoming connections. This second group contains the remaining bots, including: (1) bots with dynamic IP addresses; (2) bots with private IP addresses; or (3) bots behind firewalls such that they cannot be reached from the global Internet. Only servant bots are candidates in peer lists. Another example is the Hierarchical Kademlia bot, which extends the base Kademlia bot. Each level in the hierarchy consists of a set of clusters or islands of bots. These clusters use Kademlia for intra-cluster communication. Each cluster has a super peer that is responsible for communicating with other super peers in the next level up in the hierarchy. The super peers thus facilitate inter-cluster communication (see page 2).

Overall, despite the advancements that cybercriminals have developed, some of the oldest botnet C&C communication techniques are still being used today due to their availability via open or leaked source code, or do-it-yourself kits. The table below provides a few data points published by the security community over the past few years.

| C&C Communications | Apr 2008 (Arbor Networks) | 2008 (Symantec) | 2009 (Symantec) | Q2 2010 (Microsoft) | 2011 (govcert.nl) |
|---|---|---|---|---|---|
| Centralized / IRC | 90% | 44% | 31% | 38.2% | 30% |
| Centralized / HTTP | 4% | 57% | 69% | 29.1% | |
| Distributed / P2P | 5% | n/a | n/a | 2.3% | 70% |
| Other | 1% | n/a | n/a | 30.5% | |
Notable Quote from Ed Skoudis, Founder of Counter Hack Challenges and SANS Fellow (Feb 2012)
"Number of malware threats that receive instructions from attackers through DNS is expected to increase, and most companies are not currently scanning for such activity on their networks, security experts said at the RSA Conference 2012 on Tuesday. While most malware-generated traffic passing through most channels used for communicating with botnets (such as TCP, IRC, HTTP or Twitter feeds and Facebook walls) can be detected and blocked, its not the case for DNS (Domain Name System) and attackers are taking advantage of that."
http://www.circleid.com/posts/malware_increasingly_uses_dns_as_command_and_control_channel/

DNS-based Communications within Any Topology
Essentially, DNS records are abused to traffic data in and out of a network. Every type of record (NULL, TXT, SRV, MX, CNAME or A) can be used, but the speed of the connection differs with the amount of data that can be stored in a single record (see page 2).

The outbound phase starts with the bot on the compromised device requesting a response from the local host or network DNS server for a DNS query to [data].cnc-domain.tld. The data (base32-encoded) is split and placed in the third- and lower-level domain name labels of multiple queries. Since neither local DNS server will have a cached response, the requests are forwarded to the ISP's recursive DNS servers, which in turn get responses from the cybercriminal's authoritative name server.

For the inbound phase, TXT records can store the most data (base64-encoded), which is why DNS tunnel implementations typically suggest them, reaching up to 110 kbps. However, they may not be ideal for botnets trying to avoid detection by network devices, since these are not common records. Unfortunately, simply blocking TXT records is insufficient as a defense, because doing so breaks other protocols (e.g. SPF, DKIM), and because alternative DNS records such as CNAME are common and, used in series, can still transmit detailed instructions for the compromised host to act on.

Alternatively, if two-way communication is not necessary, either the queries or the responses can omit the encoded outbound or inbound data, respectively. This makes the transfer less conspicuous to anomaly detection systems.

At present, the security community cites few countermeasures that are "silver bullets" for detecting DNS-based botnet C&C communications. Some larger, security-aware organizations could use techniques such as "split horizon" DNS to force internal hosts to send their DNS requests only through the network DNS server, and then apply statistical anomaly detection (aka signatures) to this DNS traffic. Unfortunately, there are few to no readily available signatures that are well tested enough to both guarantee protection and cause no false positives.
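As a rough illustration of what statistical anomaly detection on query names might look like, the sketch below flags names whose third- and lower-level labels are unusually long or look like encoded data. It is a heuristic under stated assumptions, not a tested signature: the function names, the length and entropy thresholds, and the choice to ignore the rightmost two labels are all made up for this example.

```python
import math
from collections import Counter

# Assumed thresholds for illustration; tune them against real traffic.
MAX_LABEL_LEN = 40
MIN_PAYLOAD_LEN = 16
MIN_ENTROPY_BITS = 3.5


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def looks_like_tunnel(qname, registered_labels=2):
    """Flag a query name whose lower-level labels look like encoded data.

    The rightmost `registered_labels` labels (e.g. cnc-domain.tld) are
    ignored; the remaining labels are where a tunnel would carry data.
    """
    labels = qname.lower().rstrip(".").split(".")
    data_labels = labels[:-registered_labels]
    if not data_labels:
        return False
    payload = "".join(data_labels)
    too_long = any(len(label) > MAX_LABEL_LEN for label in data_labels)
    high_entropy = (
        len(payload) >= MIN_PAYLOAD_LEN
        and shannon_entropy(payload) >= MIN_ENTROPY_BITS
    )
    return too_long or high_entropy
```

An ordinary name such as www.example.org is not flagged, while a name carrying a long base32-style label in front of the registered domain is. Expect false positives from legitimate services that use long or random-looking hostnames, and note that a bot trickling short queries over long periods would slip under these thresholds.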