PeerShark - Detecting Peer-to-Peer Botnets by Tracking Conversations
1. DETECTING PEER-TO-PEER BOTNETS BY TRACKING CONVERSATIONS
Pratik Narang¹, Subhajit Ray¹, Chittaranjan Hota¹ and Venkat Venkatakrishnan²
¹BITS Pilani, Hyderabad campus, India
²University of Illinois at Chicago
6. Previous work
• Initial work used signature-based approaches
• Evaded by bots using encryption
• Recent work – analysis of network behavior
• Most of it uses 5-tuple ‘flow-based’ approach
<Source IP, Dest. IP, Source port, Dest. Port, Protocol>
• Great success in Internet traffic classification
• Doesn’t suit the needs of P2P traffic
7. Identifying P2P traffic
• Modern P2P apps and bots randomize ports and operate over TCP as well as UDP
• P2P traffic is bi-directional in nature
• E.g., BitTorrent: seeders and leechers
• Thus, traditional flow-based approaches may give a false view of network communication
• The notion of a conversation is better suited to P2P
• Who is talking to whom?
• Irrespective of protocol, port, etc.
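The contrast between a 5-tuple flow and a conversation can be illustrated with a minimal sketch. The packet records, the addresses and the helper names `flow_key` and `conversation_key` are hypothetical, chosen only for illustration; the sketch assumes a conversation is keyed by the unordered pair of IP addresses.

```python
# Hypothetical packet records: (src_ip, dst_ip, src_port, dst_port, protocol)
packets = [
    ("10.0.0.1", "10.0.0.2", 51000, 6881, "TCP"),
    ("10.0.0.2", "10.0.0.1", 6881, 51000, "TCP"),   # reverse direction
    ("10.0.0.1", "10.0.0.2", 42000, 7000, "UDP"),   # new port and protocol
    ("10.0.0.1", "10.0.0.3", 51001, 80, "TCP"),     # a different peer
]

def flow_key(pkt):
    """Traditional 5-tuple: direction, ports and protocol all matter."""
    return pkt

def conversation_key(pkt):
    """Unordered IP pair: ignores direction, ports and protocol."""
    src_ip, dst_ip = pkt[0], pkt[1]
    return tuple(sorted((src_ip, dst_ip)))

flows = {flow_key(p) for p in packets}
conversations = {conversation_key(p) for p in packets}

print(len(flows), "flows vs", len(conversations), "conversations")
```

In this toy trace the first three packets belong to the same pair of peers, yet a 5-tuple view splits them into three separate flows.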
8. P2P apps vs. P2P bots
Applications:
• Driven by a human user, so traffic is ‘bursty’
• High volume of data transfers seen
• Small inter-arrival time of packets seen in apps
Botnets:
• Automated/scripted commands
• Low in volume, high in duration
• Large inter-arrival time of packets seen in stealthy bots
10. Approach
• Parse network traces, discard corrupted packets
• Create ‘conversations’, identified by the tuple <IP1,IP2> and an initial FLOWGAP parameter
• Aggregate conversations again – this time with a higher FLOWGAP parameter
• To be decided by the network admin based on an understanding of the network
• Useful for detecting slow and stealthy bots
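The steps above can be sketched roughly as follows. This is not the authors' implementation: it assumes FLOWGAP is the maximum idle time (in seconds) allowed between consecutive packets of one conversation, and it approximates the second aggregation pass by re-running the grouping with a larger gap. The function name `build_conversations` and the packet layout are hypothetical.

```python
from collections import defaultdict

def build_conversations(packets, flowgap):
    """Group packets into conversations per unordered IP pair.

    packets: iterable of (timestamp, src_ip, dst_ip, size_bytes).
    flowgap: assumed to be the maximum idle time in seconds before a
             new conversation is started for the same IP pair.
    """
    by_pair = defaultdict(list)
    for ts, src, dst, size in packets:
        by_pair[tuple(sorted((src, dst)))].append((ts, size))

    conversations = []
    for pair, pkts in by_pair.items():
        pkts.sort()  # order by timestamp
        current = [pkts[0]]
        for pkt in pkts[1:]:
            if pkt[0] - current[-1][0] > flowgap:
                conversations.append((pair, current))
                current = [pkt]
            else:
                current.append(pkt)
        conversations.append((pair, current))
    return conversations

# Hypothetical trace: two short exchanges separated by a long silence.
packets = [
    (0.0, "10.0.0.1", "10.0.0.2", 100),
    (5.0, "10.0.0.2", "10.0.0.1", 200),
    (400.0, "10.0.0.1", "10.0.0.2", 100),
    (405.0, "10.0.0.2", "10.0.0.1", 150),
]
short_gap = build_conversations(packets, flowgap=60)
long_gap = build_conversations(packets, flowgap=3600)
print(len(short_gap), len(long_gap))
```

With the small gap the trace splits into two conversations; with the larger gap the two exchanges merge into one long-lived conversation, which is the kind of slow, low-volume exchange a stealthy bot might produce.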
11. Approach
• For each tuple, extract 4 features:
– The duration of the conversation
– The number of packets exchanged in the conversation
– The volume of the conversation (no. of bytes)
– The median inter-arrival time of packets in the conversation
• Hunt for long-lived, stealthy conversations
• Categorize P2P apps & bots with the features above, using supervised machine learning approaches
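A minimal sketch of the four features, assuming a conversation is available as a time-sorted list of (timestamp, size in bytes) pairs. The function name `extract_features`, the dictionary keys and the sample values are hypothetical; the resulting feature vector would then be passed to a supervised classifier of choice.

```python
from statistics import median

def extract_features(pkts):
    """Compute the four per-conversation features.

    pkts: non-empty list of (timestamp, size_bytes), sorted by timestamp.
    """
    times = [ts for ts, _ in pkts]
    # Inter-arrival times between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "duration": times[-1] - times[0],
        "num_packets": len(pkts),
        "volume_bytes": sum(size for _, size in pkts),
        # A single-packet conversation has no gaps; 0.0 is an assumed default.
        "median_iat": median(gaps) if gaps else 0.0,
    }

# Hypothetical conversation with four packets.
conv = [(0.0, 100), (2.0, 200), (10.0, 100), (11.0, 50)]
features = extract_features(conv)
print(features)
```

The median is less sensitive than the mean to a few unusually long pauses within a conversation.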
12. Dataset
P2P app name      Used for?                        Type of data / Size of data
eMule             P2P file sharing application     pcap file / 19 GB
uTorrent          P2P file sharing application     pcap file / 33 GB

P2P botnet name   What it does?                    Type of data / Size of data
Storm             Email spam                       pcap file / 4.8 GB
Waledac           Email spam, password stealing    pcap file / 1.1 GB
17. Limitations & Possible evasions of PeerShark
• Only built for 2 apps and 2 bots. Any new app/bot will also get (mis)classified into one of these classes.
• If more than one P2P application (benign or malicious) is running between two peers, PeerShark will not be able to correctly classify it.
• Smarter bots which engage in occasional file-sharing with bot-peers (and thus mimic benign behavior) can evade PeerShark.