On Evaluating Policy-Based Bandwidth Management Devices



Huan-Yun Wei (corresponding author; footnote 1), Ying-Dar Lin
Department of Computer and Information Science
National Chiao Tung University, Hsinchu, Taiwan
Tel: +886-3-5712121-ext56667  FAX: +886-3-5721490
Email: {hywei,ydlin}@cis.nctu.edu.tw

Policy-based bandwidth management defines how to allocate bandwidth resources according to organizational policy rules. Enterprises often employ such policy-based devices at their organizational edges to manage the narrow but expensive Internet access links. This work designs a novel testbed and uses it to evaluate the functionality and performance of seven such solutions: six commercial products and one open source solution. Their policy rules can be categorized into (1) class-based rules; (2) connection rules within a class; (3) bandwidth borrowing rules among classes. The testbed mimics the real-life Internet with heterogeneous Internet delays, delay jitters, and packet loss rates, and evaluates how effectively the above three policy types are enforced in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and voice over IP (VoIP) quality. The test results (footnote 2) reveal that (1) explicitly sizing the TCP window can degrade performance or fairness even under slight packet loss rates; (2) the open source solution can compete with commercial products in accurately limiting flow aggregates; (3) voice quality over IP networks depends significantly on the packet sizes of all other traffic when a narrowband (125kbps) access link is used.

Keywords: policy-based, bandwidth management, TCP, testbed, emulator

Footnote 1: Corresponding author.
Footnote 2: All test results are verified by the vendors and are reproducible through our open tools. Nowadays most benchmark reports are financed by vendors and may be biased, without practical testbeds. Guided by this neutral test, readers can obtain in-depth insights when examining bandwidth management devices.
1. Introduction

Internet services provide an economical and convenient way to carry out business, such as efficient information exchange among branch offices or efficient customer/provider access to services. However, the importance of these services varies, and enterprises often fail to use the narrow but expensive WAN link bandwidth effectively. For instance, the bandwidth required by ERP (Enterprise Resource Planning), voice over IP (VoIP), and e-business may be occupied by less important applications such as FTP. Since end-to-end Internet QoS such as DiffServ [1] is still experimental, enterprises seek to at least manage their inbound and outbound links. Thus, policy-based bandwidth management devices are deployed at organizational edges to set and enforce organizational policies.

Network administrators define policy rules to achieve the resource management objectives of the enterprise. Each policy rule contains “condition” and “action” fields that define specific actions for specific conditions. The condition defines the packet-matching criteria, such as a certain subnet, application, or protocol. The action defines the bandwidth parameters, such as “at least 100kbps” or “at most 200kbps”. Each policy rule is therefore class-based: it groups a set of traffic flows into a per-class queue according to the specified packet filter (condition), and the class of traffic is then scheduled out at its specified bandwidth (action). Moreover, the class-based rules can be further configured with bandwidth borrowing among the classes so that available bandwidth is used dynamically. Additionally, each connection within a class can be guaranteed at least a certain amount of bandwidth. Throughout this work we evaluate how effectively the above three policy types are enforced: (1) class-based rules; (2) connection rules within a class; (3) bandwidth borrowing rules among classes.
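To make the condition/action structure of a policy rule concrete, the following is a minimal sketch in Python. It is not taken from any of the tested devices: the field names (`src_subnet`, `protocol`, `dst_port`, `min_kbps`, `max_kbps`), the first-match lookup, and the two example rules are all illustrative assumptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Condition:
    """Packet-matching criteria (hypothetical field names)."""
    src_subnet: Optional[str] = None  # e.g. "192.168.88.0/24"; None matches any source
    protocol: Optional[str] = None    # "tcp" or "udp"; None matches any protocol
    dst_port: Optional[int] = None    # None matches any port

    def matches(self, src_ip: str, protocol: str, dst_port: int) -> bool:
        if self.src_subnet and ip_address(src_ip) not in ip_network(self.src_subnet):
            return False
        if self.protocol and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


@dataclass(frozen=True)
class Action:
    """Bandwidth parameters: 'at least' and/or 'at most' a given rate."""
    min_kbps: Optional[int] = None
    max_kbps: Optional[int] = None


@dataclass(frozen=True)
class PolicyRule:
    name: str
    condition: Condition
    action: Action


def classify(rules: List[PolicyRule], src_ip: str, protocol: str,
             dst_port: int) -> Optional[PolicyRule]:
    """Return the first rule whose condition matches (first-match is an assumption)."""
    for rule in rules:
        if rule.condition.matches(src_ip, protocol, dst_port):
            return rule
    return None


# Two illustrative rules: "at least 100kbps" for UDP, "at most 200kbps" for FTP commands.
rules = [
    PolicyRule("voip", Condition(protocol="udp"), Action(min_kbps=100)),
    PolicyRule("ftp", Condition(protocol="tcp", dst_port=21), Action(max_kbps=200)),
]
```

A packet that matches no rule returns `None` here; how a real device treats unmatched traffic (a default class, for example) is outside what this sketch assumes.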
The following subsections review traditional and prevalent technologies for enforcing these policy rules.

Traditional Technology: Queuing

A straightforward method for bandwidth management is to queue less important traffic and pass important traffic as soon as possible. Queuing can be roughly categorized into (1) priority-based queuing and (2) rate-based queuing. Priority-based queuing sets priorities among the classes, and the highest-priority class is scheduled out first. This is suitable for short-lived, extremely important, or transaction-oriented flows. However, priority-based queuing cannot quantitatively guarantee or limit the bandwidth of a class. As an analogy, if everyone is a VIP, then no one is a real VIP. In contrast, rate-based queuing employs packet scheduling algorithms [2] that decide which class supplies the next packet for transmission. This can effectively limit senders that try to overburden the resource, and the minimum bandwidth for important applications can be quantitatively guaranteed. Floyd and Jacobson [3] further investigate bandwidth borrowing among the classes.

Queuing has different impacts upon UDP and TCP data flows. Next we briefly review the UDP and TCP protocols.
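As an illustration of how a rate limit such as “at most 200kbps” can be enforced quantitatively, the sketch below shows a token bucket, one common building block for rate limiting. This is our own illustrative example; the text above does not state which scheduling algorithm any particular device uses, and the class name and parameters are hypothetical. Time is passed in explicitly so that the behavior is deterministic.

```python
class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch)."""

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate_bytes_per_s = rate_bps / 8.0  # refill rate in bytes per second
        self.burst_bytes = burst_bytes          # bucket depth: largest allowed burst
        self.tokens = burst_bytes               # the bucket starts full
        self.last_time = 0.0                    # time of the previous call, in seconds

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time `now` (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill according to elapsed time, never beyond the bucket depth.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # over the limit: the caller would queue or drop the packet
```

With `TokenBucket(200_000, 1500)`, one 1,500-byte packet passes immediately, a second one at the same instant is refused, and another passes once enough time has elapsed for the bucket to refill.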
Queuing the Internet Traffic: TCP vs. UDP

The majority of software applications today use TCP (Transmission Control Protocol) for data transmission because TCP establishes a reliable end-to-end connection. TCP receivers acknowledge the successful reception of each data packet by replying with an Ack to the TCP sender. Ack packets thus trigger the sender to send new data packets. Unacknowledged data packets are retransmitted to guarantee reliable data transfer. TCP also incorporates flow control and congestion control mechanisms that prevent a sender from overflowing its receiver’s buffer or overburdening the network. Each TCP sender therefore keeps two window values, the congestion window (CWND) and the receiver advertised window (RWND), which reflect the network capacity and the receiver’s capability of receiving data, respectively. A TCP sender does not keep more than min(CWND, RWND) of unacknowledged data outstanding. RWND is advertised by the receiver in TCP Ack packets and ranges widely among operating systems. CWND increases exponentially during the slow-start phase and linearly during the congestion avoidance phase, probing for available bandwidth until packet losses occur. Loss behavior differs among TCP versions, mainly in how the CWND is shrunk and raised and in how accurately the lost segments are retransmitted. Fall and Floyd [4] give a good overview of the Tahoe, Reno, NewReno, and SACK TCP versions and their problems. Padhye and Floyd [5] further investigate the TCP version distribution among 4550 Web servers.

Unlike TCP, UDP (User Datagram Protocol) lacks connection establishment, reliable data transfer, and flow control. UDP only provides port number multiplexing and is commonly used by real-time applications such as video conferencing and voice over IP (VoIP).

Queuing has different impacts upon UDP and TCP flows.
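The min(CWND, RWND) bound can be turned into a rough rate bound: a sender that may keep at most W bytes unacknowledged cannot exceed about W per round-trip time. The sketch below is a simplified model under our own assumptions (a steady window, no losses, hypothetical function names), not a description of any specific TCP implementation.

```python
def max_in_flight_bytes(cwnd_bytes: int, rwnd_bytes: int) -> int:
    """Unacknowledged data a TCP sender may have outstanding."""
    return min(cwnd_bytes, rwnd_bytes)


def throughput_cap_kbps(cwnd_bytes: int, rwnd_bytes: int, rtt_ms: int) -> float:
    """Upper bound on the sending rate: one window per round-trip time.

    Bits per millisecond equal kilobits per second, so no further scaling is needed.
    """
    window = max_in_flight_bytes(cwnd_bytes, rwnd_bytes)
    return window * 8 / rtt_ms


# A 2,500-byte advertised window over a 100 ms round trip caps the flow
# at 200 kbps, however large the congestion window has grown.
cap = throughput_cap_kbps(cwnd_bytes=64_000, rwnd_bytes=2_500, rtt_ms=100)
```

Because the bound is the minimum of the two windows, lowering either one lowers the rate a sender can reach.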
For real-time UDP traffic, the bit rate is often fixed, and the video/voice quality depends heavily on the loss rate, delay, and delay jitter. The packet scheduler must allocate enough bandwidth for real-time UDP traffic to minimize packet losses and delay at the controlling device. Moreover, the packets of real-time traffic need to be scheduled out smoothly at even intervals to minimize the delay jitter. For TCP traffic, flows competing for the same queue can cause a large number of data packets to be queued in the device, resulting in a high buffer requirement and large packet latency at the device. Moreover, the TCP flows may not share the class bandwidth fairly, especially when their round-trip times (RTT) differ. Thus many vendors apply specific algorithms to regulate TCP traffic.

Specific Algorithms for TCP Traffic

To guarantee the bandwidth of each TCP connection within a class, and hence achieve fairness among the flows within a class, the ideal solution is to actively control the sending rate of each sender within the class instead of letting the senders compete with each other. Queuing, and with it the queuing delay and buffer requirement, can thus be reduced. Other types of traffic such as UDP can only resort to the primitive solution, queuing, to control their bandwidth passively. Two methods exist for controlling each TCP connection: (1) window-sizing and (2) packet-dropping.
1. Window Sizing: Since a TCP connection can be actively controlled through the feedback Acks, the window-sizing method directly influences the number of bytes sent by shrinking the RWND in the TCP Acks. In this test, iPolicer, PacketShaper, WiseWAN, QoSWorks, and Guardian Pro belong to this type. Karandikar et al. [6], sponsored by Packeteer, investigate the window-sizing technique. Though window-sizing can directly control per-connection bandwidth, it needs to readjust its Ack regulation whenever another connection enters or leaves the class.

2. Packet Dropping: Because a TCP sender slows down its transmission rate in response to network congestion by halving its congestion window size, the packet-dropping method drops packets and expects the sender to slow down when it detects the packet loss events [7]. In this test, FloodGate (which uses per-flow queuing) and ALTQ_CBQ+RED belong to this type.

This work designs a novel testbed for evaluating the effectiveness of the policy enforcement techniques used by existing products or solutions. The testbed mimics real-life Internet characteristics such as WAN delay, delay jitter, and packet loss. Section 2 compares the relevant information of the devices under test (DUT). Section 3 describes the design of our testbed and the test methodology. Section 4 presents the test results. Finally, a summary of the test results and conclusions are given in Section 5.

2. Device under Test (DUT)

This test project invited nine vendors, and six of them joined the test. Table 1 compares the relevant information of all the DUTs. Most DUTs are installed at the LAN-router link to prevent router queues from overflowing and causing congestion. Because the grade of each DUT differs, only low bandwidth configurations (below 1.544Mbps) are tested. This minimizes hardware differences so that the test results reflect the true management capability of each DUT.

Vendor/Model | Grade (announced) | S/W ver. | OS, HW/SW | Install at | Boot from | CPU | RAM | Interface | Fail over | Log to
-------------|-------------------|----------|-----------|------------|-----------|-----|-----|-----------|-----------|-------
ALTQ 2.2 [8] | 100Mbps | 2.2 | FreeBSD, software | Between LAN and router | Hard disk (our PC) | P-III 700MHz | 256M SDRAM | 2 Intel 100M NICs | N | Same FreeBSD
NetGuard’s Guardian Pro [9] | 10Mbps | 5.02 | NT 4.0, software | Between LAN and router | Hard disk (our PC) | P-III 700MHz | 256M SDRAM | 2 Intel 100M NICs | N | Same NT server
CheckPoint’s FloodGate [10] | 45Mbps | 4.1 | NT 4.0, software | Between LAN and router | Hard disk (our PC) | P-III 700MHz | 256M SDRAM | 2 Intel 100M NICs | HA* | Same NT server
BroadWeb/Acute’s iPolicer 100-CR2202 [11][12] | 100Mbps | 1.6.4 | Embedded NT, hardware | Between LAN and router | Flash 32M | P-III 600 | 128M | 10/100Mbps | N | Another NT server
Packeteer’s PacketShaper 4500 [13] | 45Mbps | 4.1.2 | Embedded Linux, hardware | Between LAN and router | Flash | P-III 600 | 128M | 10/100Mbps | Y | Embedded hard disk
Sitara’s QoSWorks QWX-10000 [14] | 100Mbps | 1.8 | Embedded FreeBSD, hardware | Between LAN and router | Hard disk | P-III 600 | 192M | 10/100Mbps | Y | Embedded hard disk
NetReality’s WiseWan 200/500 [15] | 5Mbps | 4.0 | Proprietary, hardware | WAN link | Flash 32M | P 133 | 32M | V.35 (10Mbps log) | Y | Another NT server

Note 1: The invited vendors also include Lucent’s Access Point and Allot’s NetEnforcer (these two decided not to join this test after examining our test plan) and Cisco’s Cisco Assure (which declined to join from the beginning).
Note 2: Fail over is defined as the capability of bypassing traffic when the power is off. HA means high availability module (optional).
Note 3: Sitara revealed to us that QoSWorks uses ALTQ_CBQ.
Table 1: Product information and software/hardware platforms

2.1 Functionality of Policy Console

Network administrators use the policy console to define organizational bandwidth policy rules. Table 2 lists the functionality of each policy console. All DUTs can limit the bandwidth of a class. Moreover, most DUTs, except for Guardian Pro and ALTQ, can guarantee the minimum bandwidth of each connection within the class. These two settings can be further combined with (a) inter-class bandwidth borrowing and (b) intra-class bandwidth borrowing, respectively. In (a) the DUT redistributes any bandwidth left unused by some classes to the other active classes; in (b) if any flow in a class terminates, its bandwidth is fairly redistributed to the other flows in the class.

Vendor/Model | Classifier: Src/Dst IP/Port#, mask, Prot. ID | Classifier: Host list | Direction control (In/Out) | UDP traffic control | WAN link speed setup | Per-class control: class limit | Per-class control: guarantee BW for each connection in the class | Borrowing: inter-class | Borrowing: intra-class
-------------|----|----|------|---|---|---|---|--------|--------
ALTQ | Y | N | Both | Y | Y | Y | N | Auto | Compete (2)
NetGuard’s Guardian Pro | Y | Y | Both | Y | Y | Y | N | Degree (1) | Compete
CheckPoint’s FloodGate | Y | Y | Both | Y | Y | Y | Y | Degree | Degree
NetReality’s WiseWan | Y | Y | Both | Y | Y | Y | Y | Auto | Auto
Acute/Broadweb’s iPolicer | Y | Y | Both | N | N | Y | Y | N | N
Packeteer’s PacketShaper | Y | Y | Both | Y | Y | Y | Y | Degree | Degree
Sitara’s QoSWorks | Y | N | Both | Y | Y | Y | Y | Auto | Auto

(1) Degree means that administrators can manually specify the degree of bandwidth borrowing.
(2) DUTs without connection guarantee let the flows within the class compete with each other.
Table 2: Functionality Comparison of the Devices under Test

2.3 Protocol Support

Table 3 compares the protocol support of each DUT. Most Internet services/protocols can be recognized by their layer-4 TCP/UDP port numbers. However, layer-7 awareness can increase the simplicity and capability of bandwidth management. For example, the FTP protocol includes the passive mode, in
which the FTP-data port (port 20, for sending data) can be dynamically changed to another port by negotiation over the FTP-Cmd port (port 21, for sending FTP commands). If the DUT cannot interpret the negotiation on the FTP-Cmd port, it cannot control the connection that actually carries the data. PacketShaper and WiseWAN have the richest layer-7 awareness. In terms of the number of built-in port-service mapping entries, WiseWAN and PacketShaper are also the richest, followed by FloodGate and Guardian Pro. iPolicer, QoSWorks, and ALTQ have few or no built-in port-service mapping entries and require the administrator to look up port numbers manually. Although iPolicer can identify UDP, it cannot control UDP bandwidth.

Vendor/Model | Layer awareness | Layer-7 type | Built-in port-service mappings: TCP | Built-in port-service mappings: UDP | ICMP | IPX | # of other protocols
-------------|---|---------------|-----|-----|---|---|------
ALTQ | 4 | N | 0 (manually assign port #) | 0 (manually assign port #) | N | N | Manually assign port #
NetGuard’s Guardian Pro | 4 | N | 60 | 35 | Y | N | 15
CheckPoint’s FloodGate | 7 | URL/MIME-TYPE | 60 | 35 | Y | N | Manually assign port #
NetReality’s WiseWan | 7 | URL/MIME-TYPE | 109 | 79 | Y | Y | Above 250
Acute/BroadWeb’s iPolicer | 4 | N | 12 | Cannot control | Y | N | Manually assign port #
Packeteer’s PacketShaper | 7 | URL | Total above 200 (layer 2~7) | Total above 200 (layer 2~7) | Y | Y | Above 200
Sitara’s QoSWorks | 4 | N | 0 (manually assign port #) | 0 (manually assign port #) | N | N | Manually assign port #

*Note: This table lists only the protocols that a DUT can control, not those it can merely recognize.
Table 3: Comparison of Protocols Support

Appendix A-1 and A-2 further compare the policy console user interfaces and special functions of the DUTs. Most DUTs mix priority-based and rate-based queuing; however, this test focuses on “rate-based policy” that controls “TCP connections flowing from enterprises (LAN) to WAN”, since TCP accounts for most Internet traffic. For UDP traffic, this test focuses on real-time applications such as voice over IP (VoIP). Differences between the configured bandwidth and the measured results are quantified.

3. Testbed and Test Methodology

The testbed and test methodology significantly influence test results and require careful examination to avoid misinterpreting the results.
3.1 Testbed: Mimics the Real-Life Internet

The Internet is very dynamic. Different connections take different paths and therefore have different distances and path qualities. Our testbed mimics these properties by setting a WAN delay, WAN delay jitter, and WAN packet loss rate on each routing path. Figure 1 and Table 6 show complete information about our testbed and testing tools. Test data flows from X to Y, passing through the DUT, routers, monitoring point, and WAN emulator. The Cisco routers are installed specifically for WiseWAN because of its V.35 interface. Each DUT is tested individually on this testbed. Appendix B displays a photo of our testbed. IP-aliasing employed at A and I in Fig. 1 emulates multiple competing sources and their corresponding sinks, respectively. A self-written wan-emu virtual interface driver is used to emulate the dynamics of the Internet. Both are detailed after Table 6.

[Figure 1 diagram: testbed topology. The source hosts (A, subnet 192.168.88.X) send through a 100M hub, the DUT, Cisco 2514 routers joined by a V.35 cable, the monitoring points, and the WAN emulator (H) to the FTP server (I, subnet 10.1.1.X); the intermediate subnets are 172.16.86.X to 172.16.89.X. Cisco 1750 VoIP gateways with telephones and a SmartBits unit running SmartVoIpQoS provide the voice traffic.]
Figure 1: The Testbed: Mimic the Real-life Internet
Note: All PCs are equipped with Intel Express Pro 10/100Mbps network interface cards. The V.35 serial clock rate between the Cisco routers is set to 2Mbps. Each DUT is tested individually on this testbed.

Tool | Function | Description | Position in Fig. 1
-----|----------|-------------|-------------------
Ncftpput [16] | TCP traffic generator | Traffic: 20 ncftpput flows from subnet X to subnet Y. Packet size: 1,500 bytes. TCP options: SACK/timestamp/window-scaling disabled. | A
SmartVoIpQoS [17] | VoIP (UDP) traffic generator | Traffic: single VoIP flow with RTP format UDP packets. Codec: G.729 (50 frames/sec, frame size = 74 bytes, around 30kbps). | M
VoIP Gateway | Same as above | Same as above | K and N
ttt [18] | Real-time traffic bandwidth monitor | Monitors the bandwidth of the traffic passing through it by protocol, source/destination IP, etc. | G
Tcpdump [19] | Packet sniffer | Dumps each packet’s header to a RAM disk to avoid I/O overheads. | A and H
Self-written AWK scripts [20] | Data analyzer | Calculates statistics from the tcpdump result. | G
Self-written wan emulator [20] | WAN emulator | Applies different delays, delay jitters, and random/periodic packet loss rates to different flows. | H

Table 6: Testing Tools

1. IP-aliasing (see footnote 3): In Linux each network interface card (NIC) can emulate 100 NICs, with each virtual NIC having a unique IP address. With a proper routing table setup at A in Fig. 1, we can direct flows destined for a certain virtual NIC at I through a certain virtual NIC at A. Virtual NICs generate packets with their corresponding IP addresses, so the DUT sees outgoing TCP data packets as coming from different local hosts and incoming TCP Acks as coming from different remote hosts. Moreover, packets are sent without link-layer collisions since only a single physical NIC is present at each of A and I.

2. wan-emu: Wan-emu is a Linux virtual interface driver that resides between the IP layer and the NIC driver. In this testbed, multiple wan-emu virtual devices are attached to the sink-side last-hop NIC driver (at H, with IP 10.1.1.254) to apply different impairments to different routes. With proper static routes, we can direct flows destined for a virtual NIC at I through a specific wan-emu interface that has the desired link characteristics. Each packet passing through is stamped with the time at which it should be released. An interrupt is triggered every 1ms to check how many packets are due and should be forwarded. The timer granularity can easily be tuned to 8192 Hz in Linux. Impairments such as random/periodic loss and delay jitter are also implemented.

3.2 Test Methodology

This test includes three sub-tests: Basic Test, Robustness Test, and Advanced Test.

A. Basic Test

This test evaluates the accuracy of the class bandwidth and the fairness among the connections within the class. It also investigates the stability of each DUT across five runs.
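Returning briefly to the wan-emu design of Section 3.1, its stamp-and-release scheduling can be sketched in user-space Python as follows. This is only an illustration of the idea, not the actual kernel driver: the class and method names are hypothetical, delay jitter is omitted, and interpreting “periodic loss” as dropping every N-th packet is our assumption.

```python
import heapq


class DelayEmulator:
    """User-space sketch of a delay/periodic-loss emulator (hypothetical names)."""

    def __init__(self, delay_ms: int, loss_period: int = 0):
        self.delay_ms = delay_ms
        self.loss_period = loss_period  # drop every N-th packet; 0 disables loss
        self._seen = 0                  # packets offered so far
        self._seq = 0                   # tie-breaker that keeps FIFO order in the heap
        self._queue = []                # min-heap of (release_time_ms, seq, packet)

    def enqueue(self, packet, now_ms: int) -> bool:
        """Stamp the packet with its release time; return False if it is dropped."""
        self._seen += 1
        if self.loss_period and self._seen % self.loss_period == 0:
            return False
        heapq.heappush(self._queue, (now_ms + self.delay_ms, self._seq, packet))
        self._seq += 1
        return True

    def tick(self, now_ms: int) -> list:
        """Called on every timer tick: release all packets that are due."""
        due = []
        while self._queue and self._queue[0][0] <= now_ms:
            due.append(heapq.heappop(self._queue)[2])
        return due
```

With a fixed delay a plain FIFO would suffice; the heap is used here so that per-packet jitter could be added to the release time without changing `tick`.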
The total WAN link bandwidth is set to T1 (1.544Mbps; see footnote 4) and is partitioned into five classes (20, 40, 128, 256, and 1100kbps), with each class matching four TCP connections. Each class is set to guarantee each connection 1/4 of the class bandwidth (see footnote 5). All settings are fixed, without any bandwidth borrowing. The test is repeated in five consecutive runs, with 200-second intervals in between. Within each run, 20 FTP connections flow simultaneously from A to I (Table 6), with each class matching 4 connections. After 250 seconds, all the ncftpput processes are killed. Data from 30 to 230 seconds are analyzed. The statistics are explained in Table 7. Appendix C uses an intuitive example to illustrate the following statistics.

Footnote 3: Note that some operating systems, such as FreeBSD and Windows 2000, merely support alias IP addresses and cannot support alias interfaces.
Footnote 4: BroadWeb/Acute iPolicer does not have a WAN link speed setup.
Footnote 5: NetGuard Guardian Pro cannot accept per-connection settings.

Statistic | Quantifies what? | Definition | Comparison standard
----------|------------------|------------|--------------------
Accuracy | The difference between (1) the class bandwidth setting and (2) the measured class bandwidth. | Averaged normalized goodput*: $\frac{1}{5}\sum_{i=1}^{5}\frac{\text{measured class goodput for Run } i}{\text{given class goodput for Run } i}$ | The closer to 1, the better
Stability of accuracy | The variation of the accuracy statistic among the five runs. | CoV** of the normalized goodput among the five runs (same as above, but take the CoV among the 5 runs instead of the average). | It depends***
Fairness | Fairness of bandwidth usage among the 4 connections in each class. | Averaged CoV among the 4 connections’ goodputs: $\frac{1}{5}\sum_{i=1}^{5}\left(\text{CoV of goodputs among the 4 connections in Run } i\right)$ | The closer to 0, the better
Stability of fairness | The variation of the fairness statistic among the five runs. | Same as above, but take the standard deviation among the 5 runs instead of the average. | It depends***
Retransmission ratio | Retransmission ratio in each class. | $\frac{1}{5}\sum_{i=1}^{5}\frac{\text{retransmitted packet count for Run } i}{\text{total packet send count for Run } i}$ | The closer to 0, the better

* Goodput is the effective throughput (bytes/time) excluding the bandwidth consumed by retransmission.
** CoV denotes “coefficient of variation,” which means “standard deviation over mean.”
*** If the accuracy tends to 1, it is better for its stability to be 0; this implies that the DUT always performs accurately. However, if the accuracy tends to 0 and its stability also tends to 0, the DUT always performs inaccurately. The same applies to fairness and its stability (Appendix C).
Table 7: Basic Test Statistics

B. Robustness Test

Packets may be generated by different operating systems, hence by different TCP implementations, and may pass through paths with various delays and loss rates. Long-distance TCP connections are expected to be vulnerable to Internet losses because they need more time to obtain the Acks required to recover to their target bandwidth. Since many DUTs regulate TCP Acks, we are also concerned with whether they are compatible with the major operating systems. Table 8 describes our test methodology.

Test item | DUT settings | Test methodology | Comparison standard
----------|--------------|------------------|--------------------
Under Heterogeneous Internet Delays | Same as Basic Test. | WAN delays of the four connections in each class are 10ms, 50ms, 100ms, 150ms. | Same as the Basic Test.
Under Various Internet Loss Rates | 200kbps for the test flow. | A single TCP connection is tested under 0.5%, 1%, 2%, 4%, and 8% periodic loss rates. | Whether the goodput can smoothly degrade.
Under Different Sending Operating Systems | 80kbps for the test flow. | (1) WAN: delay = 50ms, periodic loss rate = 1%. (2) TCP source OS = {Linux 2.2.14, Windows 2000, FreeBSD 4.0, Solaris 8}. (3) TCP receiver OS = Linux 2.2.14. (4) A single TCP connection is tested each time. | How closely the byte-time lines of the operating systems overlap with each other.

Table 8: Robustness Test Methodology
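The Basic Test statistics defined in Table 7 can be sketched in Python as follows. The function names are ours, and the use of the population standard deviation (`pstdev`) is an assumption, since the table does not say whether the population or the sample form is meant. Each `*_per_run` argument holds one entry per run (five in this test).

```python
from statistics import mean, pstdev


def cov(values):
    """Coefficient of variation: standard deviation over mean."""
    return pstdev(values) / mean(values)


def accuracy(measured_per_run, configured):
    """Average over the runs of measured class goodput / configured class bandwidth."""
    return mean(m / configured for m in measured_per_run)


def stability_of_accuracy(measured_per_run, configured):
    """CoV of the normalized goodput among the runs."""
    return cov([m / configured for m in measured_per_run])


def fairness(conn_goodputs_per_run):
    """Average over the runs of the CoV among the connections' goodputs."""
    return mean(cov(run) for run in conn_goodputs_per_run)


def stability_of_fairness(conn_goodputs_per_run):
    """Standard deviation among the runs of the per-run CoV."""
    return pstdev([cov(run) for run in conn_goodputs_per_run])


def retransmission_ratio(retx_per_run, sent_per_run):
    """Average over the runs of retransmitted packets / total packets sent."""
    return mean(r / s for r, s in zip(retx_per_run, sent_per_run))
```

For a 20kbps class that measures exactly 20kbps in every run with four equal connections, `accuracy` gives 1 and `fairness` gives 0, the ideal values in Table 7.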
C. Advanced Test

This test includes a bandwidth borrowing test and a VoIP quality test. Bandwidth borrowing has been described in Section 2. VoIP quality is tested separately through SmartBits and the VoIP Gateway to evaluate whether the DUTs can allocate adequate bandwidth for voice traffic. Each test is conducted under heavily loaded FTP traffic. The detailed test methodology is given in Table 9.

Test item | DUT settings | Test methodology | Comparison standard
----------|--------------|------------------|--------------------
Inter-class Bandwidth Borrowing | (1) Link speed = T1 (1.544Mbps), divided into 2 classes A, B. A = B = 777kbps. (2) Class A matches connection 1, Class B matches connection 2. (3) A and B can borrow from each other. | Connections 1 and 2 are started and stopped in sequence. | (1) Stability of each connection. (2) How seamless the total bandwidth line is when connection 1 terminates.
Intra-class Bandwidth Borrowing | (1) Link speed = T1 (1.544Mbps), divided into 1 class A. A = 1.544Mbps. (2) The class matches connections 1 and 2. (3) Per-connection bandwidth: at least 777kbps, at most 1.544Mbps. | Same as above. | Same as above.
VoIP test using SmartVoIpQoS | (1) Link speed = {T1, 125kbps}, divided into 2 classes A, B. (2) A = 30kbps for voice traffic, B = {T1, 125kbps} - 30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins. | Background: 20 FTP connections. Foreground: a 30kbps G.729 VoIP flow. | PSQM (note 1), jitter, delay, and loss.
VoIP test using VoIP Gateway (Cisco 1750) | Same as above. | Background: 20 FTP connections. Foreground: dial a phone (JP to NP, G.729 codec), hold X’s and Y’s phones, speak 1 to 10 at 2 words/sec, and judge the voice quality. | Listening with ears (note 2).

Note 1: PSQM (Perceptual Speech Quality Measurement) is calculated from delay, jitter, and loss statistics. A PSQM of 6.5 denotes the poorest quality.
Note 2: The VoIP Gateway is set to sample the sound continuously even when the primary tester keeps silent. Thus the data flow is always around 30kbps.
Table 9: Advanced Test Methodology

4. Benchmark Test Results

A. Basic Test Results

A-1. Accuracy and Stability of Accuracy

Figure 2 (A1 is accuracy and B1 is its stability; A2 and B2 are discussed in the robustness test) reveals that the DUTs can be classified into three groups. ALTQ_CBQ, PacketShaper, and QoSWorks have the most accurate and stable control for each class. WiseWAN and FloodGate are less effective in the narrowband class (20kbps) because of their large retransmission ratios, as will be shown in Section A-3. iPolicer and Guardian Pro are the least effective. With iPolicer, several connections terminate in the middle of each run; these connections send no data, waste their bandwidth, and cause instability among the five runs (see footnote 6).

Figure 2: Results of accuracy and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)

A-2. Fairness and Stability of Fairness

Figure 3 (A1 is fairness and B1 is its stability; A2 and B2 are discussed in the robustness test) also distinguishes three groups. PacketShaper is the most fair and stable. QoSWorks is less fair but stable in the 20kbps class, meaning that it is consistently less fair in that class in all five test runs (Appendix C). FloodGate and WiseWAN are less fair and less stable in the 20kbps class. iPolicer, Guardian Pro, and ALTQ_CBQ+RED provide poor fairness. Pure CBQ has the poorest fairness in the narrowband (20~40kbps) classes. This is somewhat alleviated after applying RED to each class, because RED tends to drop more packets from the connection that sends data more aggressively.

Footnote 6: The test crew performed many “five-run” tests on iPolicer. Only after the above phenomenon had been verified did we include the most representative of the “five-run” tests in our analysis.
Figure 3: Results of fairness and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)

A-3. Retransmission Ratio

Figure 4 A1 (A2 is discussed in the robustness test) shows large retransmission ratios in the narrowband classes (20~40kbps) for all DUTs except PacketShaper and QoSWorks, and especially for WiseWAN, iPolicer, FloodGate, and ALTQ_CBQ+RED. As an analogy, a small exit often keeps many people waiting in front of it. FloodGate and ALTQ_CBQ+RED use “packet dropping” to slow down TCP flows, so they have high retransmission ratios. WiseWAN suffers enormous packet losses at the Cisco router before it can control the traffic at the WAN link. The results of iPolicer are hard to explain given the technology it claims to use (adjusting the TCP window size).
Figure 4: Test results of retransmission ratio (A1: No Internet Delay; A2: With Internet Delay)

B. Robustness Test Results

B-1. Under Heterogeneous Internet Delays

For easy comparison, these results are plotted alongside those of the Basic Test: Figure 2 (A2, B2), Figure 3 (A2, B2), and Figure 4 (A2) show the results. Most results amplify the differences among the DUTs observed in the Basic Test, especially for iPolicer and ALTQ_CBQ in the fairness statistic. Long-distance connections are vulnerable to packet losses caused by buffer overflows at the controlling device, as described in Section 3.2 B. ALTQ_CBQ+RED alleviates the unfairness of ALTQ_CBQ because the short-distance connections, which send data more aggressively, have more packets dropped by the RED mechanism. Guardian Pro cannot guarantee per-connection bandwidth and thus shows significant instability between the Basic Test and this test. QoSWorks is less fair in the broadband class (1.1Mbps).

B-2. Under Various Packet Loss Rates

Normally a TCP flow slows down its transmission rate when packet losses occur. Figure 5 shows the goodput through each DUT under different Internet packet loss rates (each flow is configured with 200kbps, and the measured goodput is averaged over 200 seconds as in the Basic Test). Almost all the DUTs lower their goodput smoothly as the packet loss rate increases, except for PacketShaper and iPolicer. These two devices give up sizing the TCP window once they detect a TCP loss event (triple duplicate Acks). The TCP sending window then suddenly jumps up and causes a burst of packets to flow into the controlling device, resulting in a higher goodput at the 0.5% loss rate. This phenomenon diminishes as the packet loss rate increases.
[Figure 5 plot omitted: bandwidth (kbps, 160 to 200) versus loss rate (0, 0.5, 1, 2, 4, 8 %) for iPolicer, FloodGate, PacketShaper, WiseWAN, GuardianPro, QoSWorks and ALTQ_CBQ.]
Figure 5: Robustness Test - goodput under various packet loss rates

B-3. Under Different Sending Operating Systems

In this compatibility test (see Figure 6; the X axis is time and the Y axis is the bytes sent, so the slope is the bandwidth), TCP connections sent from different operating systems through PacketShaper produce different results. PacketShaper shrinks the TCP window so that no more than 4 packets are in the WAN pipe. With so few packets in flight, a loss seldom produces the triple duplicate Acks that fast retransmit needs, so each packet loss is recovered by a retransmission timeout instead [21]. BSD-derived UNIX systems use a coarse-grained retransmission timer (500ms) [21], so they retransmit lost packets slowly. In contrast, Linux keeps a fine-grained retransmission timer and performs best when packet losses occur. iPolicer has a serious bug when data is sent from Windows 2000 to Linux 2.2.14: tcpdump shows that the TCP Ack header length is miscalculated when passing through iPolicer, which incorrectly triggers data packets from the TCP sender. TCP has many options and various implementations, so explicitly modifying the packet header requires rigorous compatibility tests. The other products treat TCP flows from different operating systems fairly.

Figure 6: Robustness Test - under different sending operating systems

C. Advanced Test Results

C-1. Bandwidth Borrowing Test Results

This test uses ttt to observe the effectiveness of bandwidth borrowing. In each figure we focus on only three lines: the total bandwidth (ip/ether line), the bandwidth of connection 1 (xxxx/tcp line) and the bandwidth of connection 2 (yyyy/tcp line). The test crew draws an additional baseline indicating the ideal total link bandwidth (1.544Mbps) for comparison.

Inter-Class Bandwidth Borrowing Test Results

Figure 7 shows the inter-class bandwidth borrowing benchmark results. iPolicer does not have this function, so we set the bandwidth of both classes to 1.544Mbps. However, the link between the Cisco routers is set to 2Mbps, so the two 1.544Mbps flows through iPolicer exceed the baseline bandwidth. After connection 1 terminates, the total bandwidth narrows down to around 1.5Mbps with some fluctuation. WiseWAN and ALTQ borrow bandwidth among classes automatically, and the others can additionally be configured with the degree of bandwidth borrowing. Guardian Pro looks unstable when connection 2 starts to obtain a bandwidth share. ALTQ_CBQ and ALTQ_CBQ+RED can only borrow a limited amount of bandwidth (from 777kbps to 1.1Mbps). FloodGate, PacketShaper and QoSWorks perform inter-class bandwidth borrowing seamlessly.

(a) Acute/Broadweb iPolicer  (b) CheckPoint FloodGate  (c) NetGuard GuardianPro  (d) ALTQ_CBQ
(e) Packeteer PacketShaper  (f) Sitara QoSWorks  (g) NetReality WiseWAN  (h) ALTQ_CBQ+RED
Figure 7: Inter-class Bandwidth Borrowing Test

Intra-Class Bandwidth Borrowing

Figure 8 shows the intra-class bandwidth borrowing benchmark results. iPolicer lacks this function, so after connection 1 terminates, connection 2 cannot occupy the newly available bandwidth within the class. Guardian Pro and ALTQ_CBQ show fluctuating bandwidth sharing between the two connections since they cannot guarantee per-connection bandwidth. In ALTQ_CBQ this fluctuation is again slightly alleviated after applying RED. The other four products behave similarly in this test, except that PacketShaper and FloodGate show small gaps.
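The idea of bandwidth borrowing tested above can be sketched as a simple allocation: every class first receives its configured rate (limited by its demand), and any bandwidth left on the link is lent to classes that still have unmet demand, in proportion to their configured rates. This is a simplified model for illustration only and not the algorithm of any DUT; the class names and the 772kbps configured rates are hypothetical.

```python
def allocate(link_kbps, configured, demand, borrow=True):
    """Return the bandwidth (kbps) granted to each class.

    configured: class name -> configured rate (assumed positive)
    demand:     class name -> offered load
    borrow:     when False, idle bandwidth is never lent to other classes
    """
    # Step 1: each class gets its own rate, limited by what it wants.
    alloc = {c: min(configured[c], demand[c]) for c in configured}
    if not borrow:
        return alloc

    # Step 2: lend the unused link bandwidth to classes that want more.
    spare = link_kbps - sum(alloc.values())
    hungry = [c for c in configured if demand[c] - alloc[c] > 1e-9]
    while spare > 1e-9 and hungry:
        weight = sum(configured[c] for c in hungry)
        granted = 0.0
        for c in hungry:
            extra = min(spare * configured[c] / weight, demand[c] - alloc[c])
            alloc[c] += extra
            granted += extra
        spare -= granted
        hungry = [c for c in hungry if demand[c] - alloc[c] > 1e-9]
    return alloc


if __name__ == "__main__":
    rates = {"class1": 772.0, "class2": 772.0}  # hypothetical class rates
    both_active = {"class1": 1544.0, "class2": 1544.0}
    conn1_done = {"class1": 0.0, "class2": 1544.0}
    print("both active:      ", allocate(1544.0, rates, both_active))
    print("after conn 1 ends:", allocate(1544.0, rates, conn1_done))
    print("without borrowing:", allocate(1544.0, rates, conn1_done, borrow=False))
```

The same function models intra-class borrowing if the dictionary keys are taken to be the connections within one class and `link_kbps` is the class bandwidth.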
(a) Acute/Broadweb iPolicer  (b) CheckPoint FloodGate  (c) NetGuard Guardian Pro  (d) ALTQ_CBQ
(e) Packeteer PacketShaper  (f) Sitara QoSWorks  (g) NetReality WiseWAN  (h) ALTQ_CBQ+RED
Figure 8: Intra-class Bandwidth Borrowing Test

C-2. VoIP Quality Test

This test does not include iPolicer since it presently cannot control UDP traffic. The test is performed with both the Smartbits and the Cisco 1750 VoIP gateways. The former gives quantitative results while the latter lets the testers judge the voice quality by ear. Figure 9 (a) shows that under the T1 WAN link (1.544Mbps) the DUTs differ in latency and jitter. However, the resulting voice quality grades (PSQM) are similar, except for ALTQ_CBQ. This is confirmed by the VoIP gateway test (Table 10). We thus conclude that under a T1 access link the G.729 bit rate can be easily allocated. In contrast, under the 125kbps WAN link (Figure 9 (b) and Table 10), the voice can only barely be recognized, and only with PacketShaper. Transmitting a large packet (1500 bytes) over the narrowband WAN link (125kbps) takes so long that a small voice packet (74 bytes) behind it has to wait until the large packet has been completely sent.
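The waiting time described above follows directly from the serialization delay, that is, packet size in bits divided by link rate. The following minimal sketch uses the packet sizes and link rates from the text; it assumes the full link rate is available to the packet and ignores link-layer framing overhead.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time (ms) needed to clock one packet onto a link of the given rate."""
    return packet_bytes * 8 / link_bps * 1000.0


LINKS = {"T1 (1.544Mbps)": 1_544_000, "125kbps": 125_000}

if __name__ == "__main__":
    for name, bps in LINKS.items():
        data = serialization_delay_ms(1500, bps)   # large data packet
        voice = serialization_delay_ms(74, bps)    # small voice packet
        print(f"{name}: 1500-byte packet {data:.1f} ms, "
              f"74-byte voice packet {voice:.1f} ms")
```

On the 125kbps link a single 1500-byte packet occupies the line for about 96 ms, so a voice packet queued behind it is delayed by that amount even if it is scheduled next; on the T1 link the same packet takes under 8 ms.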
However, after QoSWorks exercises Packet Size Optimization (reducing the maximum transfer unit of FTP connections when the connections are established), the voice quality approaches the original voice in both the Smartbits and gateway tests. While this is promising, readers should be aware that reducing the packet size of all other TCP connections can cause large overhead. As an analogy, several small trucks carrying the goods incur more overhead than one big truck carrying the same goods. This tradeoff is left to the judgment of the network administrator.

[Figure 9 plots omitted. For each of (a) T1 WAN link (1.544Mbps) and (b) 125kbps WAN link, one panel plots average latency, max latency and jitter (latency variation) in ms, and one panel plots PSQM and loss rate (%). X axis: Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, ALTQ_CBQ; panel (b) additionally includes QoSWorks2.]
Note: The "Base" results are measured on a clean testbed without any DUT enabled. The G.729 codec is not lossless, so even when jitter and loss are low, the PSQM is at least 2.2.
Figure 9: VoIP Test Results of SmartVoIPQoS

                                T1 WAN link                                                 125kbps WAN link
                                Calling time   Delay time             Voice quality         Calling time   Delay time             Voice quality
                                                                      (legibility, by ear)                                        (legibility, by ear)
Baseline (only voice)           About 0.2 sec  Very short (<0.1 sec)  Very good             <1 sec         Very short (<0.1 sec)  Very good
Baseline (with background FTP)  Cannot establish the connection                             Cannot establish the connection
iPolicer                        Cannot be tested (does not support UDP traffic control)     Cannot be tested (does not support UDP traffic control)
FloodGate                       About 0.5 sec  Very short (<0.1 sec)  Very good             About 7 sec    About 1 sec            Very poor (<10%)
Guardian Pro                    About 0.5 sec  Very short (<0.1 sec)  Very good             About 3 sec    About 1.5 sec          Ultra poor (<1%)
WiseWAN                         About 0.5 sec  Very short (<0.1 sec)  Very good             About 7 sec    About 1.5 sec          Ultra poor (<1%)
PacketShaper                    About 0.5 sec  Very short (<0.1 sec)  Very good             About 1 sec    About 1 sec            Poor (60%)
ALTQ_CBQ                        About 2 sec    Very short (<0.1 sec)  Very good             About 18 sec   About 1 sec            Very poor (<10%)
QoSWorks                        About 1 sec    Very short (<0.1 sec)  Very good             About 17 sec   About 1 sec            Very poor (<10%)
QoSWorks Optimized              Not tested (no need to)                                     About 6 sec    Very short (<0.2 sec)  Very good

Table 10: VoIP Test Results Through VoIP Gateway

5. Conclusions

This work designs a novel testbed that mimics real-life Internet conditions, such as multiple connections, heterogeneous WAN delays/delay jitters/packet loss rates, and different TCP source implementations.
Most test reports, such as those by the Tolly Group [22], are financed by the vendors and may be biased. Additionally, the testbeds in those reports are over-simplified, lacking in-depth test items or using an inadequate number of connections. This work first classifies the policy rules into three major types: (1) class-based rules; (2) connection rules within a class; (3) bandwidth borrowing rules among classes. The test methodology then quantifies how effectively each device enforces these rule types in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and VoIP quality. The test results reveal several findings that can be reproduced with our open tools: (1) the narrowband class-based rule and its fairness among flows are harder to enforce when multiple TCP connections compete for the same queue, resulting in long queues and TCP retransmissions; (2) explicitly sizing the TCP window can degrade performance or fairness even under slight packet loss rates; (3) the open source solution can compete with commercial products in accurately limiting flow aggregates; (4) the video/voice quality of real-time applications depends significantly on the packet sizes of all other traffic when a narrowband (125kbps) access link is used.

The detailed functionality comparison among the DUTs gives further directions for enhancing open source solutions, such as Packeteer's traffic discovery and QoSWorks's intuitive user interface. The ALTQ package lacks a per-connection bandwidth guarantee within a class and needs further refinement to satisfy enterprise demands. Some vendors in this test use open source software but never release their kernel patches. We are currently patching ALTQ with a per-connection bandwidth guarantee and will contribute the patch back to the open source community. After all, open source should be open.

6. References

[1] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, An Architecture for Differentiated Services, RFC 2475, Dec. 1998.
[2] D. Stiliadis and A. Varma, Latency-Rate Servers: A General Model for Analysis of Traffic Scheduling Algorithms, IEEE/ACM Transactions on Networking, Vol. 6, No. 5, pp.611-624, Oct. 1998.
[3] S. Floyd and V. Jacobson, Link-sharing and resource management models for packet networks, IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp.365-386, 1995.
[4] K. Fall and S. Floyd, Simulation-based Comparisons of Tahoe, Reno, and SACK TCP, ACM Computer Communication Review, Vol. 26, No. 3, pp.5-21, Jul. 1996.
[5] J. Padhye and S. Floyd, On Inferring TCP Behavior, ACM SIGCOMM'2001, San Diego, USA, Aug. 2001. http://www.acm.org/sigcomm/sigcomm2001/p23.html (to appear)
[6] S. Floyd and V. Jacobson, Random Early Detection Gateways for Congestion Avoidance, IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp.397-413, Aug. 1993.
[7] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer, TCP Rate Control, ACM Computer Communication Review, Vol. 30, No. 1, Jan. 2000.
[8] K. Cho, Alternate Queueing for BSD UNIX (ALTQ), http://www.csl.sony.co.jp/person/kjc
[9] NetGuard Corporation, http://www.netguard.com
[10] Check Point Software Technologies, http://www.checkpoint.com
[11] BroadWeb Corporation, http://www.broadweb.com.tw
[12] Acute Communication Corporation, http://www.acutecomm.com
[13] Packeteer Corporation, http://www.packeteer.com
[14] Sitara Networks, http://www.sitaranetworks.com
[15] NetReality Corporation, http://www.net-reality.com
[16] Ncftpput Software, http://www.ncftp.com
[17] K. Cho, Tele Traffic Tapper (ttt), http://www.csl.sony.co.jp/person/kjc
[18] Spirent Communications, http://www.netcomsystems.com
[19] Lawrence Berkeley National Laboratory, tcpdump, http://www-nrg.ee.lbl.gov
[20] H. Y. Wei, WAN Emulator, http://speed.cis.nctu.edu.tw/wanemu/
[21] W. R. Stevens, TCP/IP Illustrated Volume 1 - The Protocols, Addison-Wesley, 1994.
[22] Tolly Group, http://www.tolly.com

Acknowledgements

We thank the vendors who so generously provided us with the devices and verified the test results. We are grateful to Ching-Chuan Chiang and Yi-Chung Liu for their help with the preliminary tests and functionality comparisons.

Appendix

Appendix A. Detailed Functionality Comparison

A-1. Policy Console User Interface

As for the policy console user interface (Table A), a notable feature is how many devices a management console can control. The policy consoles of PacketShaper and QoSWorks can control only one device, since they are built-in web servers configured through a Web browser. The policy consoles of the others (except for ALTQ) can remotely control multiple devices located at different places. As for schedule control, per-rule schedule control is more effective. For example, some rules can be inactive during non-office hours, while the VoIP rule should always be active to guarantee voice quality.
Vendor/Model            Type                Schedule    Management      OS                          Monitor/Statistics                                  Alert
                                            Control     Console

ALTQ                    Config file         N           Single device   FreeBSD 4.0                 Per-class bandwidth usage                           N/A

NetGuard's              GUI Win32           Per-rule    Global devices  Win NT/2000                 Line Statistics Report/Response Time Report/        Log
Guardian Pro            application                                                                 Protocol Distribution Report

CheckPoint's            GUI Win32           Per-rule    Global devices  Win NT/2000                 Line Statistics Report/Response Time Report/        N/A
FloodGate               application                                                                 Protocol Distribution Report

NetReality's            GUI Java            Per-rule    Global devices  Win NT/Solaris              Line Statistics Report/Port Report/                 SNMP trap
WiseWan                 application                                                                 Response Time Report/Protocol Distribution Report/
                                                                                                    VoIP Report/Top Ten Talkers/
                                                                                                    Top Ten Protocols or Apps

BroadWeb/Acute's        Web browser         Per-rule    Global devices  Web server: another NT      Line Statistics Report/Top Ten Report/              Email trap
iPolicer                (Java applet)                                   Web client: IE 5.0          Top Ten Talkers/Top Ten Protocols

Packeteer's             Web browser         Per-device  Single device   Web server: embedded        Utilization/Network Efficiency/Top Ten Classes/     SNMP trap
PacketShaper            (HTML)                                          Web client: any             Top Twenty Talkers/Per-class Bandwidth Usage/
                                                                                                    Response Time Report

Sitara's                Web browser         Per-device  Single device   Web server: embedded        Per-class Bandwidth Usage/Link statistics/          SNMP trap
QoSWorks                (HTML)                                          Web client: any             Top classes per link/Top Applications/
                                                                                                    Protocol Distribution/Traffic by address

Table A: Management Interface and Statistics of Flow

A-2. Special Functions

PacketShaper is superior in its Traffic Discovery, which automatically identifies the protocols of the traffic passing through it and gives the network administrator instant feedback for further bandwidth settings. With the other products, the administrator has to check manually whether newly specified packet filters capture the corresponding traffic. WiseWAN is installed directly on the WAN link (V.35 cable) and can thus verify whether the measured bandwidth matches the subscribed bandwidth. Additionally, it can detect PVCs in a frame relay network, so a single WiseWAN device can control all the traffic on the mesh-structured frame relay links among branch offices. QoSWorks focuses strongly on controlling VoIP traffic. By shrinking the TCP data packet size, it lets VoIP traffic (UDP packets) pass through smoothly, especially on a narrowband WAN link. Moreover, QoSWorks has a built-in Web cache (not verified in this report). Both FloodGate and Guardian Pro can be integrated with their vendors' firewall, VPN and NAT packages. Such integrated solutions may reduce management costs.

Appendix B. Testbed Photo

Figure B: Testbed Photo
Appendix C. Intuitive Example for Basic Test Statistics

This intuitive example illustrates how the Basic Test statistics of the 20kbps class are derived. As described in Section 3.2, each class matches four connections, and the test is repeated for five runs. Ideally, within each run each connection receives 1/4 of the class bandwidth. The example shows that an accuracy statistic of 19, although close to the ideal value of 20, does not by itself reflect real conditions, because the per-run totals range from 5 to 39. Taken together with the poor stability of accuracy, we can judge that the DUT is actually not accurate. On the other hand, "Not fair" combined with "Good stability of fairness" means that the DUT treats the flows unfairly almost all the time.

20kbps class     Ideal  Round 1  Round 2  Round 3  Round 4  Round 5
Connection 1         5        4        1       12        1       13
Connection 2         5        4        2        5        2       14
Connection 3         5        4        2        6        1        5
Connection 4         5        2        2        7        1        7
Class total         20       14        7       30        5       39
CoV                        0.23     0.25     0.36     0.35     0.39

Accuracy:              (14+7+30+5+39)/5 = 19                  => Accurate!!
Stability of accuracy: CoV(14,7,30,5,39) = 2.4                => Poor stability!!
Fairness:              (0.23+0.25+0.36+0.35+0.39)/5 = 0.32    => Not fair
Stability of fairness: Std(0.23,0.25,0.36,0.35,0.39) = 0.063  => Good stability

[The figure also shows the ideal per-connection shares of the other classes on the 1.544Mbps link: 40kbps = 4 x 10, 128kbps = 4 x 32, 256kbps = 4 x 64, 1.1Mbps = 4 x 275.]

Figure C: Intuitive Example for Basic Test Statistics
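The arithmetic of this example can be sketched in a few lines. The sketch assumes that CoV means population standard deviation divided by the mean, which is an assumption made here for illustration; the authoritative definitions are those given with the statistics table earlier in the paper. Under this assumption the per-run totals give a CoV of about 0.70 rather than the 2.4 printed in the figure, so the paper's definition of the stability of accuracy may differ.

```python
from statistics import mean, pstdev

# Per-connection bandwidth (kbps) of the 20kbps class, one list per round.
ROUNDS = [
    [4, 4, 4, 2],
    [1, 2, 2, 2],
    [12, 5, 6, 7],
    [1, 2, 1, 1],
    [13, 14, 5, 7],
]

# Per-round fairness (CoV) values as printed in the figure.
FIGURE_COV = [0.23, 0.25, 0.36, 0.35, 0.39]


def cov(values):
    """Coefficient of variation, assumed here to be pstdev / mean."""
    return pstdev(values) / mean(values)


totals = [sum(r) for r in ROUNDS]      # class bandwidth per round
accuracy = mean(totals)                # average class bandwidth
fairness = mean(FIGURE_COV)            # average per-round CoV
fairness_stability = pstdev(FIGURE_COV)

if __name__ == "__main__":
    print("class totals per round:", totals)
    print("accuracy:", accuracy)
    print("CoV of totals (assumed definition):", round(cov(totals), 2))
    print("per-round CoV (assumed definition):",
          [round(cov(r), 2) for r in ROUNDS])
    print("fairness:", round(fairness, 2))
    print("stability of fairness:", round(fairness_stability, 3))
```

The average of 19 hides per-round totals between 5 and 39, which is why the accuracy statistic has to be read together with its stability.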