PPT


  1. 1. A Machine Learning-based Approach for Estimating Available Bandwidth. Ling-Jyh Chen (Academia Sinica), Cheng-Fu Chou and Bo-Chun Wang (National Taiwan University)
  2. 2. <ul><li>Link Capacity : maximum IP-layer throughput that a flow can get, without any cross traffic. </li></ul><ul><li>Available Bandwidth : maximum IP-layer throughput that a flow can get, given (stationary) cross traffic. </li></ul>Definition
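The two definitions can be illustrated with a minimal sketch. It assumes the common per-link formulation, in which a link's available bandwidth is its capacity times its unused fraction and a path takes the minimum over its links. The function names and the example link values are hypothetical and do not come from the slides.

```python
def path_capacity(links):
    """Smallest link capacity along the path (the narrow link)."""
    return min(capacity for capacity, _ in links)


def path_available_bandwidth(links):
    """Smallest unused capacity along the path (the tight link)."""
    return min(capacity * (1.0 - utilization) for capacity, utilization in links)


# Hypothetical 3-hop path: (capacity in bit/s, average utilization in [0, 1]).
links = [(100e6, 0.2), (10e6, 0.1), (20e6, 0.8)]

capacity = path_capacity(links)              # set by the 10 Mbps link
available = path_available_bandwidth(links)  # set by the busy 20 Mbps link
```

In this example the link that limits capacity and the link that limits available bandwidth are different, which is why the later slides distinguish the bottleneck capacity from the tightest link's available bandwidth.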
  3. 3. <ul><li>Statistical cross-traffic models: </li></ul><ul><ul><li>Measure the time interval between the arrival of any two successive probe packets at the receiver and use the dispersion measurements to estimate the available bandwidth. </li></ul></ul><ul><ul><li>E.g., Delphi, IGI, Spruce </li></ul></ul><ul><li>Self-induced congestion models: </li></ul><ul><ul><li>Based on the intuition that if the probing rate is lower than the available bandwidth, the probe packets will not experience additional queueing delay during transmission. </li></ul></ul><ul><ul><li>E.g., TOPP, Pathload, pathChirp </li></ul></ul><ul><li>However, these approaches are either inaccurate or intrusive. </li></ul>Related Work
  4. 4. <ul><li>We propose a machine learning-based approach for accurate available bandwidth estimation. </li></ul><ul><li>The proposed approach can estimate available bandwidth even if the training dataset contains no sample with properties similar to the measured path. </li></ul><ul><li>Using a set of simulations, we show that the proposed approach is fast, accurate and non-intrusive. </li></ul>Our Contribution
  5. 5. <ul><li>The fact: </li></ul><ul><ul><li>Due to the diversity and dynamics of the Internet, collecting data from such a large-scale network and verifying its correctness is hard. </li></ul></ul><ul><li>Our ideas: </li></ul><ul><ul><li>Create a representative network in the network simulator using realistic network traces with well-established network traffic models. </li></ul></ul><ul><ul><li>Probe the network using an effective probing model and collect training data for the machine learning tool. </li></ul></ul><ul><ul><li>Estimate the available bandwidth using the trained machine learning tool. </li></ul></ul>Basic Ideas
  6. 6. <ul><li>Network Scenarios </li></ul><ul><ul><li>Topology: Tiscali topology of Rocketfuel’s trace [13] </li></ul></ul><ul><ul><ul><li>750 links and 506 nodes (221 are end-users) </li></ul></ul></ul><ul><ul><ul><li>We assume </li></ul></ul></ul><ul><ul><ul><ul><li>Propagation delay: uniformly distributed between [10,20] ms </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Buffer size of each link is 20 </li></ul></ul></ul></ul><ul><ul><ul><ul><li>50% of end-users are ADSL (3M/1Mbps), and the remainder are academic networks (100Mbps) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Core router links are 1Gbps </li></ul></ul></ul></ul>System Settings
  7. 7. <ul><li>Network Scenarios </li></ul><ul><ul><li>Traffic: based on measurement results </li></ul></ul><ul><ul><li>[7] T. Karagiannis, K. Papagiannaki, and M. Faloutsos. Blinc: multilevel traffic classification in the dark. In ACM SIGCOMM , 2005. </li></ul></ul>System Settings
  8. 8. <ul><li>Network Scenarios </li></ul><ul><ul><li>Traffic: based on measurement results </li></ul></ul><ul><ul><li>[12] M. Roughan, S. Sen, O. Spatscheck, and N. Duffield. Class-of-service mapping for qos: a statistical signature-based approach to ip traffic classification. In ACM IMC , 2004. </li></ul></ul>System Settings
  9. 9. <ul><li>Probing Models </li></ul><ul><ul><li>Packet Train model </li></ul></ul><ul><ul><ul><li>Send k packets in a burst (back-to-back) </li></ul></ul></ul><ul><ul><ul><li>k-1 dispersions observed </li></ul></ul></ul><ul><ul><ul><li>k = 11 </li></ul></ul></ul>System Settings
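As a minimal sketch of the packet train model, the receiver can turn the k arrival timestamps of one train into k-1 dispersions. The function name and the synthetic timestamps below are hypothetical; only k = 11 comes from the slide.

```python
def train_dispersions(arrival_times):
    """Return the gaps between successive packet arrivals of one train."""
    if len(arrival_times) < 2:
        raise ValueError("a train needs at least two packets")
    return [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]


K = 11  # train length used in the slides

# Hypothetical receiver timestamps in seconds, one per probe packet.
arrivals = [0.010 + 0.001 * i for i in range(K)]

dispersions = train_dispersions(arrivals)  # K - 1 = 10 values
```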
  10. 10. <ul><li>Probing Models </li></ul><ul><ul><li>pathChirp-like model </li></ul></ul><ul><ul><ul><li>Send a chirp of fifteen packets each time </li></ul></ul></ul><ul><ul><ul><li>The lowest sending rate is five percent of the bottleneck capacity </li></ul></ul></ul><ul><ul><ul><li>Spread factor γ =1.2 </li></ul></ul></ul>System Settings
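The chirp parameters above can be sketched as a sending schedule. This sketch assumes that successive inter-packet gaps shrink by the spread factor, so the instantaneous probing rate grows by that factor at each step. The packet size (1,000 bytes), the 100 Mbps bottleneck capacity, and the function name are assumptions for illustration only.

```python
def chirp_gaps(capacity_bps, packet_bits=8000, n_packets=15,
               gamma=1.2, lowest_fraction=0.05):
    """Inter-packet sending gaps (seconds) for one chirp.

    The first gap probes at lowest_fraction * capacity_bps; each
    following gap is shorter by a factor of gamma.
    """
    lowest_rate = lowest_fraction * capacity_bps
    first_gap = packet_bits / lowest_rate
    return [first_gap / gamma ** i for i in range(n_packets - 1)]


PACKET_BITS = 8000  # assumed probe size: 1,000 bytes
gaps = chirp_gaps(100e6, packet_bits=PACKET_BITS)  # assumed 100 Mbps bottleneck
rates = [PACKET_BITS / gap for gap in gaps]        # instantaneous rate per gap
```

A chirp of fifteen packets yields fourteen gaps, which matches the fourteen dispersions per sample used in the pathChirp-like evaluation.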
  11. 11. <ul><li>Machine Learning Tools </li></ul><ul><ul><li>Unsupervised learning: EM, K-means clustering </li></ul></ul><ul><ul><li>Supervised learning: k-NN, SVM </li></ul></ul><ul><li>We use SVM in this study, because </li></ul><ul><ul><li>It can handle missing data caused by packet loss. </li></ul></ul><ul><ul><li>It can interpolate/extrapolate the system output. </li></ul></ul><ul><ul><li>The computation overhead is affordable. </li></ul></ul>System Settings
  12. 12. <ul><li>Packet Train model </li></ul><ul><li>pathChirp-like model </li></ul><ul><li>Comparison with other tools </li></ul><ul><li>Scale-Free approach </li></ul>Performance Evaluation
  13. 13. <ul><li>Each sample is the probing result of a randomly selected node pair. </li></ul><ul><li>16,000 samples as the training data. </li></ul><ul><li>1,500 samples as the test data. </li></ul><ul><li>Each sample of the training data comprises 13 properties: 10 dispersions, hop count, bottleneck capacity, and the tightest link’s available bandwidth. </li></ul><ul><li>Each sample of the test data contains all of the above information, except the available bandwidth. </li></ul>Evaluation: Packet Train Model
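The sample layout described above can be sketched as follows. The function name, the ordering of the properties, and the example values are assumptions; the slides state only which properties a sample contains.

```python
def build_sample(dispersions, hop_count, bottleneck_capacity, available_bw=None):
    """Return (features, label) for one probed node pair.

    The label is the tightest link's available bandwidth. It is None for
    test samples, where it is the value to be estimated.
    """
    features = list(dispersions) + [hop_count, bottleneck_capacity]
    return features, available_bw


# Hypothetical training sample: 10 dispersions + hop count + capacity + label.
train_features, train_label = build_sample([0.001] * 10, 12, 3e6,
                                           available_bw=1.8e6)

# Hypothetical test sample: same features, no label.
test_features, test_label = build_sample([0.0012] * 10, 9, 100e6)
```

With ten dispersions, the twelve features plus the label give the 13 properties of a training sample.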
  14. 14. <ul><li>The results are divided into three groups based on their bottleneck link capacity. </li></ul>Evaluation: Packet Train Model
  15. 15. <ul><li>Each chirp consists of 15 packets with a spread factor γ =1.2. </li></ul><ul><li>16,000 samples as the training data. </li></ul><ul><li>1,500 samples as the test data. </li></ul><ul><li>Each sample of the training data comprises 17 properties: 14 dispersions, hop count, bottleneck capacity, and the tightest link’s available bandwidth. </li></ul><ul><li>Each sample of the test data contains all of the above information, except the available bandwidth. </li></ul>Evaluation: pathChirp-like Model
  16. 16. <ul><li>The results are divided into two groups based on their bottleneck link capacity. </li></ul>Evaluation: pathChirp-like Model
  17. 17. <ul><li>Compare the proposed approach using the pathChirp-like model with pathChirp and Spruce. </li></ul><ul><li>Run 1,500 tests for both pathChirp and Spruce in the same network scenario. </li></ul>Comparing with Other Tools
  18. 18. <ul><li>The results are divided into two groups based on their bottleneck link capacity. </li></ul>Comparing with Other Tools
  19. 19. Scale-Free Approach <ul><li>The proposed approach collects training data from a very limited network scenario. </li></ul><ul><li>Building a database that covers all types of Internet scenarios is prohibitively expensive. </li></ul><ul><li>We propose a Scale-Free approach to normalize all properties in our system. </li></ul><ul><ul><li>The dispersion measurements are divided by the initial inter-packet gap. </li></ul></ul><ul><ul><li>The observed available bandwidth is replaced with the utilization of the bottleneck link. </li></ul></ul>
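The two normalization steps can be sketched as below. The sketch assumes utilization is computed as one minus the ratio of available bandwidth to bottleneck capacity, which treats the tightest link and the bottleneck link as the same link. That formula, the function name, and the example values are assumptions and are not given in the slides.

```python
def scale_free_sample(dispersions, initial_gap, available_bw, bottleneck_capacity):
    """Normalize one sample so it no longer depends on absolute link speed."""
    # Each dispersion becomes a ratio to the gap the sender used.
    normalized = [d / initial_gap for d in dispersions]
    # Assumed formula: the available bandwidth is replaced by the utilization.
    utilization = 1.0 - available_bw / bottleneck_capacity
    return normalized, utilization


# Hypothetical sample on a 6 Mbps bottleneck with 3 Mbps available.
normalized, utilization = scale_free_sample(
    dispersions=[0.002, 0.004],
    initial_gap=0.002,
    available_bw=3e6,
    bottleneck_capacity=6e6,
)
```

Under the same assumption, an estimated utilization u on a path with known bottleneck capacity C maps back to an available bandwidth of C * (1 - u).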
  20. 20. Scale-Free Approach <ul><li>Two scenarios (bottleneck link capacities of 6 Mbps and 50 Mbps) </li></ul><ul><li>1,500 samples for each case </li></ul>
  21. 21. <ul><li>We propose a machine learning-based approach for estimating the available bandwidth of a network path. </li></ul><ul><li>We show that the pathChirp-like model outperforms the packet train model in all test cases. </li></ul><ul><li>By normalizing all attributes, we show that this approach can accurately estimate available bandwidth even if the training dataset contains no sample with properties similar to the measured path. </li></ul>Conclusion
  22. 22. Thanks! http://www.iis.sinica.edu.tw/~cclljj/ http://nrl.iis.sinica.edu.tw/