slides

  1. SMARTxAC: A Passive Monitoring and Analysis System for High-Speed Networks
     TERENA Networking Conference 2006
     Pere Barlet-Ros, Josep Solé-Pareta, Javier Barrantes, Eva Codina, Jordi Domingo-Pascual
     {pbarlet, pareta, jbarranp, ecodina, jordid}@ac.upc.edu
     http://www.ccaba.upc.edu/smartxac
     Acknowledgment: This work has been partially supported by CESCA (SMARTxAC agreement) and the Spanish MEC (ref. TSI2005-07520-C03-02)
  2. SMARTxAC
     - SMARTxAC: Traffic Monitoring and Analysis System for the Anella Científica
       - Operational since July 2003
       - Developed under a CESCA-UPC collaboration agreement
       - Tailor-made traffic monitoring system for the Anella Científica
     - Main objectives
       - Low-cost platform
       - Continuous monitoring of high-speed links without packet loss
       - Detection of network anomalies and irregular usage
       - Multi-user system: network operators and institutions
     - Measurement of two full-duplex GigE links
       - Connection between Anella Científica and RedIRIS
       - Current load: ≈ 1.5 Gbps / ≈ 270 Kpps
  3. Anella Científica
     [Figure. Labels: measurement point, 2 x GigE full-duplex]
  4. Daily Network Usage
  5. System Architecture
     - Monitoring high-speed links is challenging
       - Traffic must be collected at Gbps rates, producing terabytes of data per day to store
       - Limitations of current technology
         - CPU power, memory access speeds, bus and disk bandwidth, storage capacity, etc.
     - Tailor-made system divided according to real-time constraints, with each part running on a different computer
       - Capture System (severe real-time constraints)
       - Traffic Analysis System (soft real-time constraints)
       - Result Visualization System (user driven)
     - Data reduction: discard unnecessary information as early as possible
       - Improves performance
       - Reduces storage requirements
  6. Measurement Scenario
     [Diagram. Labels: Capture System (DAG 4.3GE + GPS; interfaces dag0, dag1), Traffic Analysis System (Linux), Result Visualization System, private network, management network, CISCO 6513 (Anella Científica), Juniper M-20 (RedIRIS), 2 x 2 Gbps Internet connection, RedIRIS (Madrid); also shown: other regional nodes, ESPANIX, GÉANT, global Internet]
  7. Capture System
     - Capture hardware
       - Intel Xeon 2.4 GHz + 1 GB RAM
       - 2 x Endace DAG 4.3GE
       - 4 x Optical splitters
       - Precise timestamping using GPS (Trimble Acutime 2000)
     - Capture software
       - Multi-threaded implementation
       - Collection of packet headers without loss (no sampling)
       - 5-tuple flow aggregation
       - Aggregated flows are sent to the Analysis System
     - Data reduction
       - Header collection: ≈ 1:10 (90 GB/min → 9 GB/min)
       - Flow aggregation: ≈ 1:200 (45 GB/5 min → 200 MB/5 min)
       - Some data is kept to analyze anomalies (window of ≈ 20 GB)
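The 5-tuple flow aggregation step listed on the Capture System slide can be sketched roughly as follows. This is only an illustration of the general technique, not SMARTxAC's implementation: the `PacketHeader` and `FlowStats` types, their field names, and the per-flow counters are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple, Tuple


class PacketHeader(NamedTuple):
    """Hypothetical header record; the field names are assumptions."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int        # IP protocol number, e.g. 6 = TCP, 17 = UDP
    length: int       # packet length in bytes
    timestamp: float  # capture time in seconds


@dataclass
class FlowStats:
    """Hypothetical per-flow counters kept instead of individual headers."""
    packets: int = 0
    bytes: int = 0
    first_seen: float = float("inf")
    last_seen: float = float("-inf")


FlowKey = Tuple[str, str, int, int, int]


def aggregate_flows(headers: Iterable[PacketHeader]) -> Dict[FlowKey, FlowStats]:
    """Collapse packet headers into one record per 5-tuple."""
    flows: Dict[FlowKey, FlowStats] = defaultdict(FlowStats)
    for h in headers:
        key = (h.src_ip, h.dst_ip, h.src_port, h.dst_port, h.proto)
        stats = flows[key]
        stats.packets += 1
        stats.bytes += h.length
        stats.first_seen = min(stats.first_seen, h.timestamp)
        stats.last_seen = max(stats.last_seen, h.timestamp)
    return dict(flows)
```

The data reduction comes from storing one small record per flow rather than one record per packet, so long flows compress far better than short ones.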
  8. Measurement Scenario
     [Diagram repeated from slide 6]
  9. Traffic Analysis System
     - Analysis hardware
       - Pentium IV 2.6 GHz + 1 GB RAM
     - Analysis software
       - Aggregation of 5-tuple flows into classified flows
         - <srcIP, dstIP, srcPort, dstPort, proto> → <origin, dest., app>
         - Origins: Institutions (also Network access points)
         - Destinations: External networks RedIRIS is connected to
         - Bidirectional aggregation
       - This classification can be useful for charging/cost-sharing
     - Data reduction
       - Classified flows: > 1:1000 (≈ 60 GB/day → ≈ 50 MB/day)
       - Compared with header traces: > 1:250000 (≈ 13 TB/day)
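The mapping from a 5-tuple flow to an <origin, dest., app> triple, with both directions of a conversation folded into the same key, might look something like the sketch below. Everything concrete in it is hypothetical: the prefix tables, the institution and network names, the port table, and the rule used to decide which endpoint is the origin are all assumptions, since the slides do not give those details.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup tables (documentation prefixes, made-up names).
INSTITUTIONS = {
    ipaddress.ip_network("192.0.2.0/25"): "institution-A",
    ipaddress.ip_network("192.0.2.128/25"): "institution-B",
}
EXTERNAL_NETWORKS = {
    ipaddress.ip_network("198.51.100.0/24"): "external-1",
}
PORT_TO_APP = {(6, 80): "web", (6, 3306): "mysql", (17, 53): "dns"}


def lookup(ip, table, default):
    """Return the name of the first prefix in `table` containing `ip`."""
    addr = ipaddress.ip_address(ip)
    for net, name in table.items():
        if addr in net:
            return name
    return default


def classify(five_tuple):
    """Map (src_ip, dst_ip, src_port, dst_port, proto) to (origin, dest, app)."""
    src_ip, dst_ip, src_port, dst_port, proto = five_tuple
    # Bidirectional aggregation (assumed rule): the endpoint that belongs to
    # an institution is the origin, so both directions share one key.
    if lookup(src_ip, INSTITUTIONS, None) is not None:
        inside, outside = src_ip, dst_ip
    else:
        inside, outside = dst_ip, src_ip
    origin = lookup(inside, INSTITUTIONS, "unknown")
    dest = lookup(outside, EXTERNAL_NETWORKS, "other")
    app = (PORT_TO_APP.get((proto, dst_port))
           or PORT_TO_APP.get((proto, src_port))
           or "unknown")
    return (origin, dest, app)


def aggregate_classified(flows):
    """Sum byte counts per classified flow; `flows` yields (five_tuple, bytes)."""
    totals = Counter()
    for five_tuple, nbytes in flows:
        totals[classify(five_tuple)] += nbytes
    return dict(totals)
```

Because the key space shrinks from every observed 5-tuple to a bounded set of (origin, dest, app) combinations, the output size no longer grows with the number of flows, which is consistent with the large reduction ratios quoted on the slide.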
  10. Measurement Scenario
      [Diagram repeated from slide 6]
  11. Result Visualization System
      - Hardware
        - Pentium III 450 MHz
      - Software
        - Web-based graphical interface
        - Institutions only have access to their own statistics
        - Graphs are generated on demand
      - Available graphs
        - More than 300 combinations of graphs per institution and day
        - Statistics are updated every 5 minutes
        - Also weekly, monthly and yearly reports
  12. Use case 1: Port Scanning
      - Traffic profile per application (bps)
  13. Use case 1: Port Scanning
      - Traffic profile per application (flows/s)
  14. Use case 1: Port Scanning
      - Destination port: MySQL (tcp/3306)

      SRC IP       DST IP        SRC PORT  DST PORT
      A.B.45.75    E.F.16.126    2239      3306
      A.B.44.149   C.D.123.168   2833      3306
      A.B.45.75    E.F.73.115    2201      3306
      A.B.44.149   C.D.151.228   2667      3306
      A.B.45.75    E.F.24.241    4415      3306
      A.B.45.75    E.F.63.23     3212      3306
      A.B.44.149   C.D.220.116   1719      3306
      A.B.45.75    E.F.46.180    2672      3306
      A.B.44.149   C.D.192.56    1891      3306
      A.B.44.149   C.D.183.124   3353      3306
      A.B.44.149   C.D.155.64    3525      3306
      A.B.44.149   C.D.127.4     3694      3306
      A.B.44.149   C.D.206.188   1907      3306
      A.B.45.75    E.F.60.108    2526      3306
      A.B.44.149   C.D.120.253   2153      3306
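The pattern visible in the flow listing for use case 1 (a few sources contacting many different hosts on the same destination port) can be picked out with a simple counting heuristic. The sketch below is an illustration only: the slides do not say that SMARTxAC detects scans this way, and the function name and the `min_targets` threshold are assumptions. The sample rows are the anonymized flows from the slide.

```python
from collections import defaultdict

# (src_ip, dst_ip, src_port, dst_port) rows taken from the slide's listing.
SLIDE_ROWS = [
    ("A.B.45.75", "E.F.16.126", 2239, 3306),
    ("A.B.44.149", "C.D.123.168", 2833, 3306),
    ("A.B.45.75", "E.F.73.115", 2201, 3306),
    ("A.B.44.149", "C.D.151.228", 2667, 3306),
    ("A.B.45.75", "E.F.24.241", 4415, 3306),
    ("A.B.45.75", "E.F.63.23", 3212, 3306),
    ("A.B.44.149", "C.D.220.116", 1719, 3306),
    ("A.B.45.75", "E.F.46.180", 2672, 3306),
    ("A.B.44.149", "C.D.192.56", 1891, 3306),
    ("A.B.44.149", "C.D.183.124", 3353, 3306),
    ("A.B.44.149", "C.D.155.64", 3525, 3306),
    ("A.B.44.149", "C.D.127.4", 3694, 3306),
    ("A.B.44.149", "C.D.206.188", 1907, 3306),
    ("A.B.45.75", "E.F.60.108", 2526, 3306),
    ("A.B.44.149", "C.D.120.253", 2153, 3306),
]


def find_scanners(flows, min_targets=5):
    """Return {(src_ip, dst_port): distinct destination count} for sources
    that contacted at least `min_targets` different hosts on one port.
    `min_targets` is a hypothetical threshold chosen for the example."""
    targets = defaultdict(set)
    for src_ip, dst_ip, _src_port, dst_port in flows:
        targets[(src_ip, dst_port)].add(dst_ip)
    return {key: len(dsts) for key, dsts in targets.items()
            if len(dsts) >= min_targets}


scanners = find_scanners(SLIDE_ROWS)
```

On the slide's rows this flags both sources on tcp/3306: A.B.45.75 with 6 distinct destinations and A.B.44.149 with 9.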
  15. Use case 2: Warez Server
      - Traffic profile per application (bps)
  16. Use case 2: Warez Server
      - Top-10 (bytes)
  17. Use case 3: Denial-of-Service
      - Traffic profile per application (bps)
  18. Anomaly Detection
      - Threshold-based anomaly detection
        - An upper and a lower traffic threshold can be set per institution
        - Thresholds: bits/sec, packets/sec and flows/sec
        - Different intervals: day/night and workday/weekend
        - Once an anomaly is detected, additional information is kept
          - This additional information can be reviewed later offline
      - Profile-based anomaly detection (work in progress)
        - Time-series prediction (adaptive linear filter)
        - No prior knowledge of the “ordinary” traffic profile is needed
        - Anomalies are detected when actual traffic differs from its predicted value
        - Thresholds mitigate the limitations of adaptive prediction when anomalies are long-lasting
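Both detection styles on the Anomaly Detection slide can be illustrated with a short sketch. The slide only says "adaptive linear filter"; the normalized LMS update used here is one common choice and is an assumption, as are the function names, the night hours, the threshold values, and the `tolerance` parameter.

```python
from datetime import datetime

# Hypothetical (lower, upper) thresholds in bits/sec,
# keyed by (is_weekend, is_night).
THRESHOLDS_BPS = {
    (False, False): (50e6, 800e6),
    (False, True): (5e6, 300e6),
    (True, False): (10e6, 400e6),
    (True, True): (1e6, 200e6),
}


def interval_key(ts, night_start=22, night_end=7):
    """Pick the day/night and workday/weekend interval for a timestamp."""
    is_weekend = ts.weekday() >= 5  # Saturday = 5, Sunday = 6
    is_night = ts.hour >= night_start or ts.hour < night_end
    return (is_weekend, is_night)


def out_of_range(ts, bps, thresholds=THRESHOLDS_BPS):
    """Threshold-based check: True if traffic leaves the configured band."""
    lower, upper = thresholds[interval_key(ts)]
    return bps < lower or bps > upper


def nlms_anomalies(series, order=3, mu=0.5, tolerance=0.5):
    """Profile-based check: predict each sample from the previous `order`
    samples with a normalized-LMS filter and flag indices whose prediction
    error exceeds `tolerance` times the predicted value."""
    weights = [1.0 / order] * order  # start as a moving average
    anomalies = []
    for t in range(order, len(series)):
        window = series[t - order:t]
        predicted = sum(w * x for w, x in zip(weights, window))
        error = series[t] - predicted
        if abs(error) > tolerance * max(abs(predicted), 1e-9):
            anomalies.append(t)
        # Normalized LMS weight update.
        norm = sum(x * x for x in window) + 1e-9
        weights = [w + mu * error * x / norm for w, x in zip(weights, window)]
    return anomalies
```

In this sketch the filter keeps adapting even while an anomaly is in progress, so samples right after a spike may also be flagged and a long-lasting anomaly will gradually be learned as normal. That is the limitation the fixed thresholds are meant to cover.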
  19. Identification of Network Applications
      - Traffic classification in SMARTxAC is based on port numbers
        - Port-based classification is no longer reliable
        - P2P, dynamic ports, tunnelling, web-based services, …
      - We are developing a classification method based on machine learning techniques
        - It learns features of traffic flows that identify a given application
        - Packet payloads are only needed in the training phase
        - Once the system is trained, only packet headers are needed
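The idea of training on labelled flows and then classifying from header-derived features alone can be illustrated with a very small stand-in. The slides do not name the learning algorithm, so the nearest-centroid classifier below is purely an assumption chosen for brevity; the feature choice (mean packet size in bytes, mean inter-arrival time in seconds), the labels, and the toy training data are made up for the example.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """samples: iterable of (feature_vector, app_label).
    In the approach described on the slide, the labels would come from
    payload inspection during training; here they are simply given."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify_flow(features, centroids):
    """Assign the label of the closest centroid (header features only)."""
    return min(centroids,
               key=lambda label: math.dist(features, centroids[label]))


# Toy, made-up training data: (mean packet size, mean inter-arrival time).
TRAINING = [
    ((1200.0, 0.01), "bulk"),
    ((1300.0, 0.02), "bulk"),
    ((80.0, 0.50), "interactive"),
    ((100.0, 0.40), "interactive"),
]

centroids = train_centroids(TRAINING)
label = classify_flow((1100.0, 0.03), centroids)
```

With real traffic the features would need scaling so that no single one (here, packet size) dominates the distance, and a more capable learner would normally replace the centroid rule.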
  20. Preliminary Results (Accuracy)
  21. Port-based vs. Machine Learning
      [Figure. Panels: Port-based; Machine learning]
  22. Conclusions
      - SMARTxAC is a tailor-made network monitoring system that
        - Operates at gigabit speeds without packet loss
        - Is relatively low-cost
        - Provides very detailed information about network usage
        - Supports multiple users: network operators and institutions
      - Since 2003, CESCA has used SMARTxAC daily to detect anomalies, attacks, performance problems, network faults, etc.
      - Future work
        - Anomaly detection and application identification
        - Sampling, IPv6 support, …
        - Deployment of more measurement points in the Anella Científica
        - Release the source code under an open-source license
        - Collaboration with Intel’s CoMo: http://como.intel-research.net
