Honeypot Essentials






Anton Chuvakin, Ph.D., GCIA, GCIH

WRITTEN: 2003

DISCLAIMER: Security is a rapidly changing field of human endeavor. The threats we face literally change every day; moreover, many security professionals consider the rate of change to be accelerating. On top of that, to stay in touch with such an ever-changing reality, one has to evolve with the space as well. Thus, even though I hope that this document will be useful to my readers, please keep in mind that it was possibly written years ago. Also, keep in mind that some of the URLs might have gone 404; please Google around.

Summary: The paper covers honeypot (and honeynet) basics and definitions and then outlines important implementation and setup guidelines. It also describes some of the security lessons a company can derive from running a honeypot, based on the author's experience running a research honeypot. The article also provides insights on the techniques of the attackers and concludes with considerations useful for answering the question "Should your organization deploy a honeynet?"

I.

While known to security professionals for a long time, honeypots recently became a hot topic in information security. However, technical information on their setup, configuration, and maintenance is still sparse, as are qualified people able to run them. In addition, higher-level guidelines (such as determining the need and the business case) are similarly absent. This paper will cover some of the honeypot (and honeynet) basics and definitions and then outline important implementation issues. It will also cover security lessons a company can derive from running a research honeypot.

What is a honeypot? Lance Spitzner, a founder of the Honeynet Project (http://www.honeynet.org), defines a honeypot as "a security resource whose value lies in being probed, attacked or compromised". The Project differentiates between research and production honeypots.
The former are focused on gaining intelligence about attackers and their technologies and methods, while the latter are aimed at decreasing the risk to a company's IT resources and providing advance warning about incoming attacks on the network infrastructure. Honeypots of any kind are hard to classify using the "prevention - detection - response" metaphor, but it is hoped that after reading this paper their value will become clearer to readers.

This paper will focus on operating a research honeypot, or "honeynet". The term "honeynet", as used in this article, originated in the Honeynet Project and means a network of systems with fairly standard configurations connected to the Internet. The only difference between such a network and a regular production network is that all communication is recorded and analyzed, and no attacks targeted at third parties can escape the network. Sometimes the system software is slightly modified to help deal with the encrypted communication often used by attackers. The systems are never "weakened" for easier hacking, but are often deployed in default configurations with a minimum of security patches. They might or might not have known security holes. The Honeynet Project defines such honeypots as "high-interaction" honeypots, meaning that attackers interact with a deception system exactly as they would with a real victim machine. On the other hand, various honeypot and deception daemons are "low-interaction", since they only provide an illusion to an attacker, and one that can hold their attention for a short time only. Such honeypots have value as an early attack indicator, but do not yield in-depth information about the attackers.

Research honeypots are set up with no extra effort to lure attackers: blackhats locate and exploit systems on their own. This happens because of the widespread use of automated hacking tools, such as fast multiple-vulnerability scanners and automatic penetration scripts. For example, an attacker from our honeynet attempted to scan 200,000 systems for a single FTP vulnerability in one night using such tools.
Research honeypots are also unlikely to be used for prosecuting intruders; however, researchers are known to track hacker activities using various covert techniques for a long time after the intruder broke into their honeypot. In addition, prosecution based on honeypot evidence has never been tested in a court of law. It is still wise to involve the company's legal team before setting up such a hacker study project.

Overall, the honeypot is the best tool for looking into malicious hacker activity. The reason is simple: all communication to and from the honeynet is malicious by definition. No data filtering, no false positives and no false negatives (the latter only if the data analysis is adequate) obscure the picture. Watching the honeypot provides insight into intruders' personalities and can be used to profile attackers. For example, in the recent past the majority of penetrated Linux honeypots were hacked by Romanian attackers.

What are some of the common-sense prerequisites for running a honeynet? First, a honeypot is a sophisticated security project, and it makes sense to take care of security basics first. If your firewall crashes or your intrusion detection system misses attacks, you are clearly not yet ready for a honeypot deployment. Running a honeypot also requires advanced knowledge of computer security. After running a honeynet for my employer, a member of the Honeynet Research Alliance, I can state that operating a honeynet presents the ultimate challenge a security professional can face. The reason is simple: no "lock it down and maintain secure state" model is possible for such a deception network. It requires in-depth expertise in many security technologies and beyond.

Some of the technical requirements follow. Clearly, the honeypot systems should not be allowed to attack other systems or, at least, such ability should be minimized. This requirement often conflicts with a desire to create a more realistic environment in which malicious hackers "feel at home" and manifest the full spectrum of their behavior.

Related to the above is the need for proper separation of the research honey network from the company's production machines. In addition to protecting innocent third parties, similar measures should be used to prevent attacks against your own systems from your honeypot.

Honeypot systems should also have reliable out-of-band management. The main reason for having this capability is to be able to quickly cut off network access to and from the honeypot in case of emergency (and they do happen!) even if the main network connection is saturated by an attack. That may sound contradictory to the above statement about preventing outgoing attacks, but Murphy's Law might play a trick or two, and "human errors" can never be totally excluded.

The Honeynet Research Alliance (http://www.honeynet.org/alliance/) has guidelines on Data Control and Data Capture for the deployed honeynet. They distill the above ideas and guidelines into a well-written document, "Honeynet Definitions, Requirements, and Standards" (http://www.honeynet.org/alliance/requirements.html). The document establishes some "rules of the game", which have a direct influence on honeynet firewall rule sets and IDS policies.

Data Control is a capability required to control the network traffic flow in and out of the honeynet in order to contain the blackhat actions within the defined policy.
For example, rules such as 'no outgoing connections', 'limited number of outgoing connections per time unit', 'only specific protocols and/or locations for outgoing connections', 'limited bandwidth of outgoing connections', 'attack string filtering in outgoing connections', or a combination of them can be used on a honeynet. Data Control functionality should be multilayered, allow for manual and automatic intervention (such as remote disabling of the honeypot) and should make every effort to protect innocent third parties from becoming victims of attacks launched from the honeynet (and launched they will be!).

Data Capture defines the information that should be captured on the honeypot system for future analysis, data retention policies, and standardized data formats, which facilitate information sharing between honeynets and cross-honeynet data processing. Cross-honeypot correlation is an extremely promising area of future research, since it allows for the creation of an early warning system for new exploits and attacks. Data Capture also covers the proper separation of honeypots from production networks to protect the attack data from being contaminated by regular network traffic.

Another important aspect of Data Capture is timely documentation of attacks and other incidents occurring in the honeypot. It is crucial for research to have a well-written log of malicious activities and configuration changes performed on the honeypot system.
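To make one of the Data Control rules above concrete, here is a minimal sketch of the 'limited number of outgoing connections per time unit' policy as a sliding-window counter. It is an illustration only, not the mechanism used on our honeynet; the class name `OutboundLimiter` and the limit of three connections per minute are assumptions chosen for the example.

```python
from collections import deque


class OutboundLimiter:
    """Allow at most `max_connections` outbound connections per `window_seconds`.

    Hypothetical helper for illustration; a real honeynet would enforce
    this in the firewall itself.
    """

    def __init__(self, max_connections, window_seconds):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        self._recent = deque()  # timestamps of connections allowed in the window

    def allow(self, now):
        """Return True if a connection attempted at time `now` (seconds) may pass."""
        # Forget connections that have aged out of the sliding window.
        while self._recent and now - self._recent[0] >= self.window_seconds:
            self._recent.popleft()
        if len(self._recent) >= self.max_connections:
            return False  # over the limit: block (and, in practice, log and alert)
        self._recent.append(now)
        return True


# Assumed policy for the example: three outbound connections per 60 seconds.
limiter = OutboundLimiter(max_connections=3, window_seconds=60)
decisions = [limiter.allow(t) for t in (0, 1, 2, 3, 61)]
print(decisions)
```

The fourth attempt is refused because three connections were already allowed within the window; the attempt at 61 seconds passes again once the earliest entries have expired. A low limit of this kind still lets an intruder download tools, while making a large outbound scan or flood impractical.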
II.

Let's turn to the practical aspects of running a honeynet. Our example setup consists of three hosts (see diagram): a victim host, a firewall and an IDS. This is the simplest configuration to maintain; however, a workable honeynet can even be set up on a single machine if a virtual environment (such as VMware or UML-Linux) is used. Combining IDS and firewall functionality by using a gateway IDS (such as "snort-inline") allows one to reduce the requirement to just two machines. A gateway IDS is a host with two network cards that analyzes the traffic passing through it and can make packet-forwarding decisions (like a firewall) and send alerts based on network packet contents (like an IDS).

Currently, the honeynet uses Linux on all systems, but various other UNIX flavors will be deployed as "victim" servers by the time this paper is published. Linux machines in default configurations are hacked often enough to provide a steady stream of data on blackhat activity. "Root"-level system penetration within hours of being deployed is not unheard of. UNIX also provides a safe choice for a victim system OS due to its higher transparency and the ease of reproducing a given configuration.

The honeypot is run on a separate network connection, which is always a good idea since the deception systems should not be seen as owned by your organization. The firewall (a hardened Linux "iptables" stateful firewall) allows and logs all inbound connections to the honeypot machines and limits the outgoing traffic depending upon the protocol (with full logging as well). It also blocks all IP spoofing attempts and fragmented packets, which are often used to conceal the source of a connection or to launch a denial-of-service attack. The firewall also protects the analysis network from attacks originating from the honeypot. In fact, in the above setup, an attacker has to pierce two firewalls to get to the analysis network.
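The firewall policy just described can be summarized as a small decision function. The sketch below is written in Python purely for readability; it is not our iptables rule set. The address range (a reserved documentation network), the `Packet` fields and the outbound protocol allow-list are all assumptions made for the example, and per-protocol limiting is simplified here to a plain allow-list.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

HONEYNET = ip_network("192.0.2.0/24")        # documentation range, assumed
OUTBOUND_PROTOCOLS = {"dns", "http", "ftp"}  # hypothetical allow-list


@dataclass(frozen=True)
class Packet:
    direction: str  # "in" = arriving from the Internet, "out" = leaving the honeynet
    src: str
    dst: str
    protocol: str
    fragmented: bool = False


def decide(pkt):
    """Return (verdict, reason); every decision is meant to be logged."""
    if pkt.fragmented:
        return ("drop", "fragmented packet")
    src_inside = ip_address(pkt.src) in HONEYNET
    if pkt.direction == "in":
        # A packet from the Internet claiming a honeynet source address is spoofed.
        if src_inside:
            return ("drop", "spoofed source")
        return ("accept", "inbound allowed")
    # Outbound traffic must carry a genuine honeynet source address.
    if not src_inside:
        return ("drop", "spoofed source")
    if pkt.protocol not in OUTBOUND_PROTOCOLS:
        return ("drop", "outbound protocol not allowed")
    return ("accept", "outbound allowed")


print(decide(Packet("in", "203.0.113.5", "192.0.2.10", "ftp")))
print(decide(Packet("out", "192.0.2.10", "203.0.113.5", "irc")))
```

The asymmetry is deliberate: everything inbound is accepted so that attackers can reach the victim host, while outbound traffic is held to a much stricter policy to protect third parties.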
The IDS machine is also firewalled, hardened, and runs no services accessible from the untrusted network. The part of the rule set relevant to protecting the analysis network is very simple: no connections are allowed from the untrusted LAN to the analysis network. The IDS (Snort from www.snort.org) records all network traffic to a database and a binary traffic file via a stealth IP-less interface and also sends alerts on all known attacks detected by its wide signature base (approximately 1650 signatures as of July 2002). In addition, specially designed software is used to monitor the intruder's keystrokes and covertly send them to a monitoring station.

All data capture and data control functionality is duplicated as per Honeynet Project requirements. The 'tcpdump' tool is used as the secondary data capture facility, a bandwidth-limiting device serves as the second layer of data control, and a stealth kernel-level key logger backs up the keystroke recording. Numerous automated monitoring tools, some custom-designed for the environment, watch the honeypot network for alerts and suspicious traffic patterns.

Data analysis is crucial for the honeypot environment. The evidence, in the form of system, firewall and IDS log files, IDS alerts, keystroke captures and full traffic captures, is generated in overwhelming amounts. Events are correlated and suspicious ones are analyzed using the full packet dumps. It is highly recommended to synchronize the time via Network Time Protocol on all the honeypot servers for more reliable data correlation.
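The reason synchronized clocks matter is that correlation usually starts by interleaving events from several sources into one timeline. The following minimal sketch shows that step under simple assumptions: each log line begins with an ISO 8601 timestamp, each log is already in chronological order, and the sample entries are invented for illustration rather than taken from our honeynet.

```python
import heapq
from datetime import datetime


def parse(source, lines):
    """Yield (timestamp, source, message) tuples from 'TIMESTAMP message' lines."""
    for line in lines:
        stamp, message = line.split(" ", 1)
        yield (datetime.fromisoformat(stamp), source, message)


# Invented sample entries; real logs would be read from files.
firewall_log = [
    "2003-01-25T05:30:02 outbound tcp connection accepted",
    "2003-01-25T05:31:40 outbound udp flood blocked",
]
ids_log = ["2003-01-25T05:29:58 alert: ftp exploit attempt"]
keystroke_log = ["2003-01-25T05:30:30 intruder typed a download command"]

# heapq.merge lazily interleaves inputs that are each already sorted.
timeline = list(heapq.merge(
    parse("firewall", firewall_log),
    parse("ids", ids_log),
    parse("keys", keystroke_log),
))

for stamp, source, message in timeline:
    print(stamp.isoformat(), source, message)
```

If the clocks of the three machines drift apart, the merged order no longer reflects the real sequence of events, and an alert may appear to follow the activity it actually triggered.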
SIM software can be used to enable advanced data correlation and analysis, as well as logging the compromises using the Incident Resolution Management system.

Unlike in a production environment, having traffic data available in the honeypot is extremely helpful. It also allows for reliable recognition of new attacks. For example, a Solaris attack on the "dtspcd" daemon (TCP port 6112) was first captured in one of the Project's honeypots and then reported to CERT. Several new attacks against Linux Samba [1] servers were also detected recently.

The above setup has gone through many system compromises, several massive outbound denial-of-service attacks (all blocked by the firewall!), major system vulnerability scanning, serving as an Internet Relay Chat server for Romanian hackers and other exciting stuff. It passed with flying colors through all the above "adventures" and can be recommended for deployment.

III.

What insights have we gained about the attacking side from running the honeynet? It is true that most of the attackers "caught" in such honeynets are "script kiddies", i.e. the less enlightened part of the hacker community. While famous early honeypot stories (such as those described in Bill Cheswick's "An Evening with Berferd" and Cliff Stoll's "The Cuckoo's Egg") dealt with advanced attackers, most of your honeypot experience will probably be related to "script kiddies". Contrary to common wisdom, companies do have something to fear from the script kiddies. The number of scans and attacks aimed at Internet-facing networks ensures that any minor mistake in network security configuration will be discovered fairly soon. Every unsecured server running a popular operating system (such as Solaris, Linux or Windows) will be taken over fairly soon. Default configurations and bugs in services (UNIX/Linux ssh, bind, ftpd and now even the Apache web server and Windows IIS are primary examples) are the reason.
We have captured and analyzed multiple attack tools that use the above flaws. For example, one such tool is a fully automated scanner that looks for 25 common UNIX vulnerabilities, runs hundreds of attack threads simultaneously, and deploys a rootkit upon system compromise. The software can be set to choose a random class A network (16 million hosts) and first scan it for a particular network service. Then, on the second pass, the program collects FTP banners (such as "ftp.example.com FTP server (Version wu-2.6.1-16) ready.") for target selection. On the third pass, the servers that had the misfortune of running a particular vulnerable version of the FTP daemon are attacked, exploited, and backdoored for convenience. The owner of such a tool can return in the morning to pick up a list of IP addresses that he now "owns" (meaning, has privileged access to).

In addition, malicious attackers are known to compile Internet-wide databases of available network services, complete with their versions, so that the hosts can be compromised quickly after a new software flaw is discovered. In fact, there is always a race between various groups to take over more systems. This advantage can come in handy in case of a local denial-of-service war. While "our" attackers have not tried to draft the honeypot into their army of "zombie" bots, they did use it to launch old-fashioned point-to-point denial-of-service attacks (such as UDP and ping floods and even the ancient modem hang-up ATH DoS).

[1] Samba is a Linux/UNIX implementation of the Microsoft Server Message Block (SMB) protocol.

The attackers' behavior seemed to indicate that they are used to operating with no resistance. One attacker's first action was changing the 'root' password on the system, clearly an action that will be noticed the next time the system admin tries to log in. Not a single attacker bothered to check for the presence of the Tripwire integrity checking system, which is included by default in many Linux distributions. On the next Tripwire run, all the "hidden" files are easily discovered. One more attacker created a directory for himself as "/his-hacker-handle", something that every system admin worth his/her salt will see at once.

The rootkits (i.e. hacker toolkits for maintaining access to a system, which include backdoors, Trojans and common attack tools) now reach megabyte sizes and feature graphical installation interfaces suitable for novice blackhats. Research indicates that some of the script kiddies "own" networks consisting of hundreds of machines that can be used for DoS or other malicious purposes.

The exposed UNIX system is most often scanned for ports 111 (RPC services), 139 (SMB), 443 (OpenSSL) and 21 (FTP). Recent (2001-2003) remote "root" bugs in those services account for this phenomenon. A system with a vulnerable Apache with SSL is compromised within several days.

Another benefit of running a honeypot is a better handle on the Internet noise. Clearly, security professionals who run Internet-exposed networks are well aware of the common Internet noise (such as CodeRed, SQL, MSRPC worms, warez site FTP scans, etc.). A honeypot allows one to observe the minor oscillations of such noise. Sometimes, such changes are meaningful. In the recent case of the MS SQL worm, we detected a sharp increase in TCP port 1433 access attempts just before the news of the worm became public. The same spike was seen when the RPC worms were released.
The number of hits was similar to the well-researched CodeRed growth pattern. Thus, we concluded that a new worm was out.

An additional value of the honeypot is its use as a security training platform. Using the honeypot, a company can raise the incident response skills of its security team. Honeypot incidents can be investigated and the answers then verified against the honeypot's enhanced data collection capabilities. 'What tool was used to attack?' - here it is, on the captured hard drive or extracted from network traffic. 'What did they want?' - look at their shell command history and know. One can quickly and effectively develop network and disk forensics skills, attacker tracking, log analysis, IDS tuning and many other critical security skills in the controlled but realistic environment of the honeypot.

More advanced research uses of the honeypot include hacker profiling and tracking, statistical and anomaly analysis of incoming probes, capture of worms and analysis of malicious code development. By adding some valuable resources (such as e-commerce systems and billing databases) and using covert intelligence techniques to lure attackers in, more sophisticated attackers can be attracted and studied. Doing so, however, will increase the operating risks.
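The kind of observation described earlier in this section, a sudden jump in access attempts to a single port, can be approximated with a very simple statistical check. The sketch below is not the tooling we used; the function name `is_spike`, the threshold of three standard deviations and the hourly counts are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_spike(history, current, threshold=3.0):
    """Flag `current` if it exceeds the historical mean by `threshold` deviations.

    `history` holds earlier per-interval hit counts for one port.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    # Use a floor of 1.0 so that a perfectly flat history does not
    # turn every tiny fluctuation into an alert.
    spread = max(stdev(history), 1.0)
    return current > baseline + threshold * spread


# Invented hourly counts of connection attempts to one TCP port.
quiet_hours = [4, 6, 5, 7, 5, 6]
print(is_spike(quiet_hours, 8))    # ordinary fluctuation
print(is_spike(quiet_hours, 120))  # sharp increase worth a closer look
```

Because all traffic reaching a honeypot is unsolicited, even a crude baseline like this one is far less noisy than the same check would be on a production network.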
IV. Abuse of the compromised systems

The more recent OpenSSL incidents are more interesting since the attacker does not have root upon breaking into the system (but is, for example, user "apache"). One might think that owning a system with no "root" access is useless, but we usually see active system use in these cases. Here are some of the things that non-root attackers do on such compromised systems:

1. "IRC till you drop"
Installing an IRC bot or bouncer is a popular choice of such attackers. Several IRC channels dedicated entirely to communication between the servers compromised by a particular group were observed on several occasions. Running an IRC bot does not require additional privileges.

2. "Local exploit bonanza"
Throwing everything they have at the Holy Grail of root access seems common as well. Often, the attacker will try half a dozen different exploits in an attempt to elevate his privileges from mere "apache" to "root".

3. "Evil daemon"
A secure shell daemon can be launched by a non-root user on a high-numbered port. This was observed in several cases. In some of these cases, the intruder accepted the fact that he will not have root. He then started to make his new home on the net more comfortable by adding a backdoor and some other tools in "hidden" directories (".. " and other non-printable names are common) in /tmp or /var/tmp.

4. "Flood, flood, flood"
While spoofed DoS is stealthier and harder to trace, many of the classic DoS attacks do not require root access. For example, ping floods and UDP floods can be initiated by non-root users. This capability is sometimes abused by the intruders, who rely on the fact that even when the attack is traced, the only source found would be a compromised machine with no logs present.

5. "More boxes!"
Similar to a root-owning intruder, those with non-root shells may use the compromised system for vulnerability scanning and widespread exploitation.
Many of the scanners, such as the openssl autorooter recently discovered by us, do not need root to operate, but are still capable of discovering and exploiting a massive number (thousands and more) of systems within a short time period. Such large networks can be used for devastating denial-of-service attacks (for example, such as those recently warned about by CERT).

V. Conclusion

As a conclusion, we will try to answer the question: "Should you do it?" The precise answer depends upon your organization's mission and available security expertise. Again, the emphasis here is on research honeypots and not on "shield" or protection honeypots. If your organization has taken care of most routine security concerns, has a developed in-house security program (since calling an outside consultant to investigate your honeypot incident does not qualify as a wise investment) and requires first-hand knowledge of attacker techniques and last-minute Internet threats, the answer tends towards a tentative "yes". Major security vendors and consultancies, or universities with advanced computer security programs, might fall into this category. If you are not happy with your existing security infrastructure and want to replace or supplement it with the new cutting-edge "honeypot technology", the answer is a resounding "no". Research honeypots will not *directly* impact the safety of your organization. Moreover, honeypots have their inherent dangers. They are analyzed in papers posted on the Honeynet Project site. The dangers include uncertain liability status, possible hacker retaliation and others.

ABOUT THE AUTHOR:

This is an updated author bio, added to the paper at the time of reposting in 2009.

Dr. Anton Chuvakin (http://www.chuvakin.org) is a recognized security expert in the field of log management and PCI DSS compliance. He is an author of the books "Security Warrior" and "PCI Compliance" and a contributor to "Know Your Enemy II", "Information Security Management Handbook" and others.
Anton has published dozens of papers on log management, correlation, data analysis, PCI DSS and security management (see the list at www.info-secure.org). His blog http://www.securitywarrior.org is one of the most popular in the industry. In addition, Anton teaches classes and presents at many security conferences across the world; he recently addressed audiences in the United States, UK, Singapore, Spain, Russia and other countries. He works on emerging security standards and serves on the advisory boards of several security start-ups.

Currently, Anton is developing his security consulting practice, focusing on logging and PCI DSS compliance for security vendors and Fortune 500 organizations. Dr. Anton Chuvakin was formerly a Director of PCI Compliance Solutions at Qualys. Previously, Anton worked at LogLogic as a Chief Logging Evangelist, tasked with educating the world about the importance of logging for security, compliance and operations. Before LogLogic, Anton was employed by a security vendor in a strategic product management role. Anton earned his Ph.D. degree from Stony Brook University.