Peer-to-Peer Application Recognition Based on Signaling Activity



Because of the enormous growth in the number of peer-to-peer (P2P) applications in recent years, P2P traffic now constitutes a substantial proportion of Internet traffic. The ability to accurately identify different P2P applications from network traffic is essential for network management tasks such as service differentiation and capacity planning. However, modern P2P applications often use proprietary protocols, dynamic port numbers, and packet encryption, which make traditional identification approaches, such as port-based or signature-based identification, less effective.

In this paper, we propose an approach for accurately recognizing P2P applications running on monitored hosts based on their signaling behavior. Because signaling behavior is regulated by the underlying P2P protocol, each application exhibits distinguishing characteristics, and we consider that this behavior can serve as a unique signature for application identification. Our approach is particularly useful for three reasons: 1) it does not need to access the packet payload; 2) it recognizes applications based purely on their signaling behavior; and 3) it can identify particular P2P applications. The performance evaluation shows that 92% of a real-life traffic trace can be correctly recognized within a 5-minute monitoring period.



  1. Chen-Chi Wu (1), Kuan-Ta Chen (2), Yu-Chun Chang (1), Chin-Laung Lei (1)
     (1) Department of Electrical Engineering, National Taiwan University
     (2) Institute of Information Science, Academia Sinica
     ICC09
  2. Talk Outline
     • Introduction
     • Fundamentals of our scheme
     • Methodology
     • Performance evaluation
     • Conclusion
  3. Introduction
     • P2P traffic constitutes a substantial volume of Internet traffic
     • Accurately identifying P2P applications from network traffic is important: network management, capacity planning, etc.
     • Conventional approaches: port numbers or payload signatures
     • Limitations: dynamic ports, encrypted payload
  4. Fundamentals of Our Scheme
     • P2P applications generate two types of traffic
     • Data transfer traffic: file-sharing or file-redistribution
     • Signaling traffic: file information refreshment, peer discovery, control information exchange, etc.
     • Signaling activity is regulated by the underlying P2P protocol
     • Each P2P application may have a unique characteristic
  5. Fundamentals of Our Scheme
     • Verify our conjecture: compare the signaling activity patterns of BitTorrent, eMule, and Skype
     • Traffic data: capture the traffic of 3 hosts that execute BitTorrent, eMule, or Skype
     • Assume packets with payload size smaller than 100 bytes are signaling packets
  6. Signaling Activity Patterns
     • Assign an ID to each host contacted by the monitored host, based on the order in which the hosts are observed
     • BitTorrent: intensive exchange of signaling packets; the BitTorrent client progressively discovers new hosts
  7. Signaling Activity Patterns
     • eMule: the number of hosts increases rapidly in the first 10 minutes but increases slowly thereafter
     • Skype: most of the signaling packets belong to probe traffic
  8. Proposed Scheme
     • Identify P2P applications running on hosts based on the signaling behavior
     • How to characterize signaling traffic?
  9. Signaling Behavior Characterization
     • Keep track of the signaling packets of a monitored host for a period of time
     • Count the number of hosts contacted and the number of packets sent and received every minute
     • Classify the hosts in contact with the monitored host into 2 types: sending/receiving packets within 5 minutes => old host; otherwise => new host
     • Characterize the signaling behavior on two levels
     • Host level: based on the number of new or old hosts
     • Message level: based on the number of new or old packets
  10. Signaling Behavior Features
      • Host level: ratio of new / old hosts; growth rate of new / old hosts; correlation coefficient between the number of new and old hosts
      • Message level: ratio of new / old packets; growth rate of new / old packets; correlation coefficient between the number of new and old packets
  11. Example: Host level, ratio of new hosts
      • Keep track of the hosts in contact with the monitored host
      • Incoming direction in the 6th min.: B and D are old hosts; A, G, and H are new hosts
      • Ratio of new hosts in the 6th min. => 3/5
      [Figure: old and new hosts observed per minute in the incoming and outgoing directions; x-axis: monitor time (min.), 1 to 6]
  12. Identifier Design
      • Adopt a support vector machine (SVM)
      • Training phase: derive features from each training sample; label each training sample with the name of its P2P application; train the SVM classifier
      • Identification phase: derive features from a signaling packet stream; use the trained classifier to determine the P2P application
  13. Traffic Data

      Category            Hosts      Packets
      BitTorrent          110,711    104,722,150
      eMule               42,377     36,716,588
      Skype               61,777     34,076,328
      World of Warcraft   218        2,528,359
      TELNET              362        21,118,522
      HTTP                4,448      28,264,360
  14. Performance Evaluation
      • 10-fold cross-validation
  15. Conclusion
      • Summary: identify distinct P2P applications without examining the payload; characterize the signaling behavior possessed by P2P applications
      • Future work: consider the case in which a host launches multiple P2P applications; short flows?
  16. Thank you for your attention!
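
The host-level "ratio of new hosts" example on slide 11 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that a host is "old" in a given minute if it also appeared in any of the previous 5 minutes, and it uses the 100-byte payload threshold from slide 5 to select signaling packets. The function names, the `(minute, host, payload_size)` tuple layout, and the sample packets are all hypothetical; the sample data is chosen only so that minute 6 reproduces the 3/5 result, since the original figure cannot be recovered from the transcript.

```python
from collections import defaultdict

SIGNALING_MAX_PAYLOAD = 100  # bytes; threshold assumed on slide 5
OLD_WINDOW_MIN = 5           # minutes; window from slide 9


def hosts_per_minute(packets):
    """Group the remote hosts of signaling packets by minute.

    `packets` is a hypothetical list of (minute, host, payload_size)
    tuples for one direction (incoming or outgoing) of one monitored host.
    """
    per_minute = defaultdict(set)
    for minute, host, payload_size in packets:
        if payload_size < SIGNALING_MAX_PAYLOAD:  # keep signaling packets only
            per_minute[minute].add(host)
    return per_minute


def split_new_old(per_minute, minute):
    """Split the hosts seen in `minute` into (new, old) sets.

    Assumption: "old" means the host was also seen in one of the
    previous OLD_WINDOW_MIN minutes.
    """
    recent = set()
    for m in range(minute - OLD_WINDOW_MIN, minute):
        recent |= per_minute.get(m, set())
    current = per_minute.get(minute, set())
    return current - recent, current & recent


def new_host_ratio(per_minute, minute):
    """Fraction of the hosts seen in `minute` that are new."""
    new, old = split_new_old(per_minute, minute)
    total = len(new) + len(old)
    return len(new) / total if total else 0.0


# Illustrative incoming packets (not the data from the slide's figure).
incoming = [
    (1, "B", 40), (1, "C", 40),
    (2, "C", 60), (2, "D", 60),
    (3, "B", 52), (3, "E", 52),
    (4, "D", 48),
    (5, "B", 40), (5, "F", 40),
    (6, "B", 40), (6, "G", 40), (6, "A", 40), (6, "D", 40), (6, "H", 40),
    (6, "Z", 1400),  # large payload: treated as data transfer, ignored
]

per_minute = hosts_per_minute(incoming)
new_hosts, old_hosts = split_new_old(per_minute, 6)
print(sorted(new_hosts), sorted(old_hosts))  # ['A', 'G', 'H'] ['B', 'D']
print(new_host_ratio(per_minute, 6))         # 0.6
```

The same functions could be applied to packet counts instead of host sets to obtain the message-level ratio, although the slides do not spell out how a "new" or "old" packet is defined.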
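
Slide 10 lists a correlation coefficient between the number of new and old hosts as one of the features, without naming a specific coefficient. The sketch below assumes the Pearson coefficient computed over per-minute counts. The function name `pearson`, the sample counts, and the choice to return 0.0 for a constant series are assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-minute counts for one monitored host: the number of
# new hosts falls while the number of old hosts rises.
new_counts = [5, 4, 3, 2, 1]
old_counts = [1, 2, 3, 4, 5]
print(pearson(new_counts, old_counts))  # -1.0
```

A value like this would form one entry of the feature vector passed to the classifier, alongside the ratio and growth-rate features.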