  • In a typical cross-site HPC network there are three potential problem sources: the user site to core network link on each end (so two of them) and the core network itself.
    - In troubleshooting it is very desirable to quickly eliminate as many of these as possible, so we would like to do a binary search.
    - When the network is operational this should be user accessible and runnable if at all possible (because of the potential for work at odd hours at the local site).
    - During the implementation phase things are (perhaps) easier, since work can hopefully be scheduled beforehand and is allowed to disrupt the network.
    - Here there is a split between dedicated lightpaths and a shared network.
    - On a lightpath the first item should be an end-to-end test (iperf/netperf).
    - This is likely too disruptive (and not necessarily useful) on a shared network link because of interference to and from other users. Argus is potentially an answer here, but more likely is to eliminate the two end points first and, if they appear clean, tackle the core network.
    - ping and owamp are potentially low-impact tools to try on a shared core.
    - If there is a problem here, outside agencies are going to need to get involved; that will likely take time and involve the network staff at your site, which may or may not be possible outside business hours.
    - Assuming iperf indicates a good connection through the core network (or the core is shared), move on to the end sites.
    - NDT is useful here in that it is intended to be run by an end user, with the results potentially passed on to experts later.
    - Again there is a split between lightpaths (typically layer 2 and perhaps layer 1) and a shared core, which will usually be layer 3.
    - In a shared core, IP tools (ping, traceroute, perhaps owamp) can be useful to figure out at which hop trouble likely starts.
    - On a lightpath things are much more exciting.
    - The best case is if there is an NDT server nearby (such as at your local gigapop). Then, from your workstation, you can assess the link between you and the gigapop. If it checks good, the problem is either on the far end (between the gigapop and the user) or in the core.
    - If you don't have NDT near the endpoints you are in for a lot of grief, because there is no effective handle to troubleshoot: there are no layer 3 devices at mid points to answer ping and diagnostics (you may need to arrange for some!).
    - One extremely dangerous, if likely successful, method of troubleshooting is to modify a machine to allow iperf or netperf to function on a loopback.
    - This is dangerous because introducing a loopback (even by accident, which is all too easy) will destroy a layer 3 network for as long as the loopback is on (and for a while after, as it has to relearn the real network topology).
    - It is about the only thing that works from one end with no physical access to intermediate points. You can binary search the network with this by starting at the far end and stepping back half way at each failure. It will only find the first failure between you and the end point, but that is typically all that's there (if there are two, you will need to do this again after fixing the first, or do the same from the other end to speed things up).
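  • For illustration, a minimal end-to-end iperf test of the kind described above might look like the following (hostnames are placeholders and iperf2 syntax is assumed; run a flood test like this only on a dedicated lightpath or during staging, never on a shared core):
    $ iperf -s                                        # on the far-end host: start the server
    $ iperf -c far-end.example.org -t 30 -w 8M -i 5   # on the near end: 30 s TCP test, large window, 5 s reports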
  • OK. This slide illustrates one way to attack a performance problem. First and foremost, if you are planning a demo or other event, do test ahead of time. If you have a concerned application community, this may mean periodic testing among points close to key equipment. For example, all of the VLBI sites may test among each other. It may also mean periodic testing within your network to points in Abilene, or other campuses you talk to frequently. Now say you have a problem that the periodic testing did not pick up (there are just too many paths to test them all).
    The first question: do you have connectivity and reasonable latency? Ping will give you round-trip times, assuming it isn't blocked along the way. We'll describe a tool, owamp, that measures one-way delay, which lets you disambiguate problems that occur asymmetrically: asymmetric routing, asymmetric traffic queuing, or a dirty fiber can all cause asymmetric problems (since each fiber carries light in one direction). Are you seeing many losses with these low-rate tests? If so, there's something terribly wrong. If the latency is not what you expect, there may be a routing problem. The best-known tool is traceroute, and you can use it to make sure the path looks reasonable: it goes through your campus, possibly through a gigapop, across Abilene and down to the other side in a reasonable fashion (not taking a scenic tour of the US, for example). Remember that you have to test in the opposite direction; the Abilene router proxy and traceroute servers can help.
    Has the host been tuned? Is there potentially a duplex mismatch on one of the local Ethernet connections? Here, running NDT, also to be described today, can point out a series of common problems. NDT itself relies on web100, which instruments the Linux kernel. You might consider installing a web100 machine (or using machines with web100 code); there are additional diagnostics you can run using the web100-provided variables, and the kernel itself is better “out of the box”: it can automatically tune buffers on some TCP connections.
    If routing looks reasonable, and the host is reasonable, you may have a problem in the path. (Large losses in the low-rate tests also indicate path problems, assuming the cause isn't a duplex mismatch, local congestion such as a denial of service attack, or broken network hardware on the end system.) Iperf is a tool to run synthetic TCP streams (memory-to-memory) between two machines. Bwctl, which we will talk about today, adds authentication and scheduling to iperf and allows you to test to multiple points, including midpoints within Abilene.
  • OK, we talked about network problems. However, you also should know that the number one reason for TCP not running at “full speed” is for it to be starved for buffer space. Vendors ship TCP stacks with buffers that are tuned for the commercial Internet. If the buffer is too small, TCP, which uses a “sliding window” for flow control, must wait for packets to be acknowledged in order to advance the window and send more data. Essentially the sender is forced to stop and wait. You need to be able to buffer the number of bits you can send in one round-trip time at your desired speed. [[Note that there are also send and receive buffers; if either is too small you can end up with this problem.]] For example, with a 70 millisecond round-trip time (more or less trans-continental North America), to sustain one gigabit per second you need 8.4 megabytes of buffer space. For 100 Mbps at the same distance you need 855 kilobytes. Many stacks default to 64 kilobytes, which only allows 7.4 Mbps. One word of caution: network kilobits, megabits, and gigabits are powers of 10. Memory kilobytes and megabytes are powers of two, a kilobyte being 1024 bytes (2^10) and a megabyte being 1,048,576 bytes (2^20).
    More detail on TCP behavior from Rich: TCP has two buffers, send and receive. The sliding window is always used, not just if the buffer is too small. Since TCP delivers a reliable, in-order packet delivery service it needs to detect and recover from lost and mis-ordered packets. The sender must retain a copy of all packets sent in the event that IP packets are lost. If this buffer fills up, the sender must stop sending until ACKs are received. The receiver must also deal with out-of-order packets, so it maintains a reassembly buffer. This buffer also holds packets when the application is unable to process them. If this receive buffer fills up, the sender must stop sending until the application can process the data. So either buffer filling up can cause the sender to stop. Both sender and receiver must be tuned to eliminate buffer stalls. In some cases it may not be possible for the local host or sys-admin to fix the problem (e.g., the remote host has mis-set buffers).
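  • To sanity-check the buffer figures above, the bandwidth-delay product (RTT x bit rate, expressed in bytes) can be computed directly; a quick sketch with awk, which agrees with the 8.4 MB and 855 KB quoted above to within rounding:
    $ awk 'BEGIN { printf "%.1f MB\n", 0.070 * 1e9   / 8 / 2^20 }'   # 1 Gbps at 70 ms RTT
    $ awk 'BEGIN { printf "%.0f KB\n", 0.070 * 100e6 / 8 / 2^10 }'   # 100 Mbps at 70 ms RTT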
  • I also want to mention that the same problem carries up to the applications themselves. We won't be speaking more about this today, but for video and audio (streaming media) the lack of buffer space in the application (in our world, MPEG-2 based applications are especially bad) means the application is very sensitive to packet loss or reordering. Of course, if your application is interactive, then increased buffering can lead to lag in response, which is not desirable either. This generalizes to bad network application behavior: applications that are not robust to network changes or anomalies. Drops will occur. Reordering will occur. Even if only very occasionally. Even applications that would like to use TCP to do bulk transfer can do things like not hand enough data to TCP to allow it to stream over long distances. One that was brought to light recently is scp (and therefore ssh; this problem can also occur with standard FTP); popular versions of scp do not provide large enough buffers for TCP to stream. (There is a pointer to a good version of scp off a TCP tuning page at PSC mentioned later.)
  • So what causes these problems? Here’s a laundry-list of the “usual suspects”. First on the list, and most common is a bad host configuration. As we just mentioned, this is usually because operating systems ship tuned to the commercial internet, and we have very different paths over the Internet2 infrastructure (in particular the “bandwidth delay product” is much greater). Second is duplex mismatch, usually due to autoconfiguration failure, with one side believing it is full-duplex (can send and receive simultaneously), and the other side believing it is half-duplex (can only send or receive one at a time). This is a legacy of how the Ethernet standard has evolved. This is the major cause of “non-congestive” packet losses. Wiring or fiber problems can cause non-congestive packet losses. Bad equipment (anything from host interfaces that cannot run full-speed, to host, switch, router, or fiber equipment failure) can cause excessive delays, jitter, or non-congestive packet loss. Bad routing can cause excessive latency, or sometimes jitter due to multiple different length paths being used. Congestion causes varying delays and packet loss.
  • Three “legs” of critical end-to-end path performance characterization: precise, on-demand diagnostics; continuous, adaptive monitoring; business process constraints. Top 5 working constraints.
  • Basically, lots of administration. And the worst part is that there is no consistent policy for the amount of resources different users are allowed to use. BWCTL allows you to think about what policy you want and gives you a way to codify and enforce it uniformly.
  • The policy implementation is compile-time pluggable.

LJorgensonPvanEpp.ppt – Presentation Transcript

  • Optimum Performance, Maximum Insight: Behind the Scenes with Network Measurement Tools Peter Van Epp Network Director Simon Fraser University Loki Jorgenson Chief Scientist Conference April 17-18, 2007
  • Overview
    • Network measurement, performance analysis and troubleshooting are critical elements of effective network management.
    • Recommended tools, methodologies, and practices with a bit of hands-on
  • Overview
    • Quick-start hands-on
    • Elements of Network Performance
    • Realities: Industry and Campus
    • Contexts
    • Methodologies
    • Tools
    • Demos
    • Q&A
  • Troubleshooting the LAN: NDT (Public Domain)
    • Preview - NDT
      • Source - http://e2epi.internet2.edu/ndt/
      • Local server - http://ndtbby.ucs.sfu.ca:7123
        • http://192.75.244.191:7123
        • http://142.58.200.253:7123
      • Local instructions –
        • http://XXX.XXX
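      • Command-line alternative (a sketch, assuming the NDT client web100clt is installed):
        • $ web100clt -n ndtbby.ucs.sfu.ca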
  • Troubleshooting the LAN: AppCritical (Commercial)
    • Preview - AppCritical
      • Source - http://apparentNetworks.com
      • Local server - http://XXX.XXXX.XXX
      • Local instructions –
        • http://XXX.XXX
        • Login: “guest”, “bcnet2007”
        • Downloads
          •  User Interface
            •  Download User Interface
            • Install
        • Start and login (see above)
  • INTRO
  • Network Performance
    • Measurement
      • How big? How long? How much?
      • Quantification and characterization
    • Troubleshooting
      • Where is the problem? What is causing it?
      • Diagnosis and remediation
    • Optimization
      • What is the limiter? What application affected?
      • Design analysis and planning
  • “Functional” vs. “Dysfunctional”
    • Functional networks operate as spec’d
      • Consistent with
      • Only problem is congestion
      • “Bandwidth” is the answer (or QoS)
    • Dysfunctional networks operate otherwise
      • “Broken” but ping works
      • Does not meet application requirements
      • Bandwidth and QoS will NOT help
  • Causes of Degradation
    • Five categories of degradation:
      • exceeds specification
        • Insufficient capacity
      • diverges from design
        • Failed over to T1; auto-negotiate selects half-duplex
      • presents dysfunction
        • EM interference on cable
      • includes devices and interfaces that are mis-configured
        • Duplex mismatch
      • manifests emergent features
        • Extreme burstiness on high capacity links; TCP
  • STATS AND EXPERIENCE
  • Trillions of Dollars
    • Global Annual Spend on telecom = $2 Trillion
      • Network/Systems Mgmt = $10 Billion
    • 82% of network problems identified by end users complaining about application performance (Network World)
    • 38% of 20,000 helpdesk tests showed network issues impacting application performance (Apparent Networks)
    • 78% of network problems are beyond our control (TELUS)
    • 50% of network alerts are false positives (Netuitive)
    • 85% of networks are not ready for VOIP (Gartner 2004)
    • 60% of IT problems are due to human error (Networking/CompTIA 2006)
  • Real World Customer Feedback
    • Based on survey of 20,000 customer tests, serious network issue 38% of the time
      • 20% of networks have bad NIC card drivers
      • 29% of devices have packet loss, caused by:
        • 50% high utilization
        • 20% duplex conflicts
        • 11% rate limiting behaviors
        • 8% media errors
        • 8% firewall issues
  • Last Mile
    • Last 100m
    • LAN
      • Workstations
      • Office environment
      • Servers
    • WAN
      • Leased lines
      • Limited capacities
    • Service providers / core networks
  • METHODOLOGIES
  • Real examples from the SFU network
    • 2 links out - one to CA*net4 at 1G usually empty
    • 100M commodity link heavily loaded
      • (typically 6 times the volume of the C4 link)
    • Physics grad student doing something data intensive to a grid site in Taiwan
      • first indication: total saturation of the commodity link
      • Argus pointed at the grid transfer as symptom
      • routing problem as the cause
  • Real examples (cont.)
      • problem of asymmetric route
    • 12:45:52 tcp taiwan_ip.port -> sfu_ip.port 809 0 1224826 0
    • (last four columns: packets in, packets out, bytes in, bytes out)
      • reported the problem to Canarie NOC who quickly got it fixed
      • user's throughput much increased, commodity link less saturated!
    • Use of NDT might have increased stress!
  • Network Life Cycle (NLC)
    • Network life cycle
      • Business case
      • Requirements
      • Request for Proposal
      • Planning
      • Staging
      • Deployment
      • Operation
      • Review
    (diagram: Network Life Cycle)
  • NLC: Staging/Deployment
    • Two hosts with a crossover cable
      • ensure the end points work.
    • Move one segment closer to the end (testing each time)
      • Not easy to do if sites are geographically/politically distinct
    • Establish connectivity to the end points
    • Tune for required throughput
      • One of multiple possible points of failure – lack of visibility
    • Tools (even very disruptive tools) can help by stressing the network
      • Localize and characterize
  • NLC: Staging/Deployment (cont.)
    • Various hardware (typically network cards) and software (IP stack) flaws
      • default configurations that are inappropriate for very high throughput networks.
    • Careful what you buy
      • (cheapest is not best and may be disastrous)
      • (optical is much better, but also much more expensive than copper)
    • Tune the IP stack for high performance (see example settings below)
    • If possible, try whatever you want to buy in a similar environment (RFP/Staging)
    • Staging won't guarantee anything
      • something unexpected will always bite you.
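    • Example Linux settings for the IP-stack tuning mentioned above (illustrative values only; the idea is to raise the socket buffer ceilings toward the path bandwidth-delay product):
      • $ sysctl -w net.core.rmem_max=16777216
      • $ sysctl -w net.core.wmem_max=16777216
      • $ sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
      • $ sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"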
  • NLC: Operation
    • Easier if the network was known to work at implementation
    • Probably disrupting work so pressure is higher
      • may not be able to use the disruptive tools
      • may be occurring at a time when staff are unavailable
    • Support user (e.g. NDT)
      • researcher can point the web browser on their machine at an ndt server
      • save (even if they don't understand) the results for a network person to look at and comment on later
  • NLC: Operation (cont.)
    • automated monitoring / data collection
      • can be very expensive to implement
      • someone must eventually interpret it
    • consider issues/costs when applying for funding
    • passive continuous monitor on the network can make your life (and success) much easier
    • multiple lightpath endpoints or dynamically routed network can be challenging
      • issues may be (or appear to be) intermittent
      • because changes happen automatically – this can be maddening.
  • NLC Dependencies (diagram: dependencies among the Network Life Cycle stages)
  • METHODOLOGIES: Measurement
  • Visibility
    • Basic problem is lack of visibility at the network level
    • Performance “depends”
      • Application type
      • End-user / task
      • Benchmarks
    • Healthy networks have design limits
    • Broken networks are everything else
  • Measurement Methodologies
    • Device-centric (NMS)
      • SNMP
      • RTCP/XR / NetCONF
        • E.g. HP OpenView
    • Network behaviors
      • Passive
        • Flow-based - e.g. Cisco NetFlow
        • Packet-based – e.g. Network General “Sniffer”
      • Active
        • Flooding – e.g. AdTech AX/4000
        • Probing – e.g. AppCritical
  • E2E Measurement Challenges
    • Layer 1
      • Optical / light paths
      • Wireless
    • Layer 2
      • MPLS
      • Ethernet switch fabric
      • Wireless
    • Layer 3
    • Layer 4
      • TCP
    • Layer 5
      • Federation
  • Existing Observatory Capabilities
    • One way latency, jitter, loss
      • IPv4 and IPv6 (“owamp”)
    • Regular TCP/UDP throughput tests – ~1 Gbps
      • IPv4 and IPv6; On-demand available (“bwctl”)
    • SNMP
      • Octets, packets, errors; collected 1/min
    • Flow data
      • Addresses anonymized by 0-ing the low order 11 bits
    • Routing updates
      • Both IGP and BGP - Measurement device participates in both
    • Router configuration
      • Visible Backbone – Collect 1/hr from all routers
    • Dynamic updates
      • Syslog; also alarm generation (~nagios); polling via router proxy
  • Observatory Databases – Data Types
    • Data is collected locally and stored in distributed databases
    • Databases
      • Usage Data
      • Netflow Data
      • Routing Data
      • Latency Data
      • Throughput Data
      • Router Data
      • Syslog Data
  • GARR User Interface
  • METHODOLOGIES: Troubleshooting
  • Challenges to Troubleshooting
    • Need resolution quickly
    • Operational networks
    • May not be able to instrument everywhere
    • Often relies on expert engineers
    • Does not work across 3rd party networks
    • Authorization/access
    • Converged networks
    • Application-specific symptoms
    • End-user driven
  • HPC Networks
    • Three potential problem sources
      • user site to edge (x 2)
      • core network
    • Quickly eliminate as many of these as possible
      •  binary search
    • Easiest during implementation phase
    • Ideally - 2 boxes at the same site and move them one link at a time
      • Often impractical – deploy and pray (and troubleshoot)
  • HPC Networks (cont.)
    • Major difference between dedicated lightpaths and a shared network
    • Lightpath: end to end test
      • iperf/netperf on loopback
      • this is likely too disruptive on shared network
      • DANGEROUS
    • Alternately, NDT to local server to isolate
      • Recommended to have at least mid-path ping!
  • HPC Networks (cont.)
    • Shared: see if other users have problems
      • If not, a core problem is unlikely
      • If the core is at fault, outside agencies get involved
    • Start trouble shooting
      • both end user segments in parallel
    • Preventive measures
      • support user runnable diagnostics
      • ping and owamp - low impact monitoring
  • E2EPI Problem Statement: “The Network is Broken”
    • How can the user self-diagnose first-mile problems without being a network expert?
    • How can the user do partial path decomposition across multiple administrative domains?
  • Strategy
    • Most problems are local…
    • Test ahead of time!
    • Is there connectivity & reasonable latency? (ping -> OWAMP)
    • Is routing reasonable (traceroute)
    • Is host reasonable (NDT; Web100)
    • Is path reasonable (iperf -> BWCTL)
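    • A quick sketch of the first two checks from a Unix host (placeholder hostname; remember to test the reverse path too, e.g. via the Abilene router proxy; OWAMP and BWCTL examples appear under Tools below):
      • $ ping -c 10 far-end.example.org
      • $ traceroute far-end.example.org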
  • What Are The Problems?
    • TCP: lack of buffer space
      • Forces protocol into stop-and-wait
      • Number one TCP-related performance problem.
      • 70ms * 1Gbps = 70*10^6 bits, or 8.4MB
      • 70ms * 100Mbps = 855KB
      • Many stacks default to 64KB, or 7.4Mbps
  • What Are The Problems?
    • Video/Audio: lack of buffer space
      • Makes broadcast streams very sensitive to previous problems
    • Application behaviors
      • Stop-and-wait behavior; Can’t stream
      • Lack of robustness to network anomalies
  • The Usual Suspects
    • Host configuration errors (TCP buffers)
    • Duplex mismatch (Ethernet)
    • Wiring/Fiber problem
    • Bad equipment
    • Bad routing
    • Congestion
      • “Real” traffic
      • Unnecessary traffic (broadcasts, multicast, denial of service attacks)
  • Typical Sources of Performance Degradation
      • Half/Full-Duplex Conflicts
      • Poorly Performing NICs
      • MTU Conflicts
      • Bandwidth Bottlenecks
      • Rate-Limiting Queues
      • Media Errors
      • Overlong Half-duplex
      • High Latency
  • Self-Diagnosis
    • Find a measurement server “near me”.
    • Detect common tests in first mile.
    • Don’t need to be a network engineer.
    • Instead of:
      • “The network is broken.”
    • Hoped for result:
      • “I don’t know what I’m talking about, but I think I have a duplex mismatch problem.”
  • Partial Path Decomposition
    • Identify end-to-end path.
    • Discover measurement nodes “near to” and “representative of” hops along the route.
    • Authenticate to multiple measurement domains (locally-defined policies).
    • Initiate tests between remote hosts.
    • See test data for already run tests. (Future)
  • Partial Path Decomposition
    • Instead of:
      • “Can you give me an account on your machine?”
      • “Can you set up and leave up an Iperf server?”
      • “Can you get up at 2 AM to start up Iperf?”
      • “Can you make up a policy on the fly for just me?”
    • Hoped for result:
      • Regular means of authentication
      • Measurement peering agreements
      • No chance of polluted test results
      • Regular and consistent policy for access and limits
  • METHODOLOGIES: Application Performance
    • Network Dependent Vendors
    • Applications groups (e.g. VoIP)
    • Field engineers
    • Industry focused on QoE
  • Simplified Three Layer Model (table: OSI layers 7 Application, 6 Presentation, 5 Session, 4 Transport, 3 Network, 2 Data Link, 1 Physical, grouped into User Experience, Application Behaviors, and Network Behaviors)
  • New Layer Model (diagram: User Experience, App Behaviors, Network Behaviors)
  • App-to-Net Coupling
    • Codec
    • Dynamics
    • Requirements
    Application Model Outcomes
    • Loss
    • Jitter
    • Latency
  • E-Model Mapping: R -> MOS. The E-model generates an “R-value” (0-100) that maps to the well-known MOS score (QoE).
  • Coupling the Layers (diagram: User / Task / Process, Application Behaviors, Network Behaviors; application models test/monitor for QoE and network requirements (QoS/SLA))
  • METHODOLOGIES: Optimization
  • Network visibility (diagram: end-to-end visibility and app-to-net coupling across the end-to-end network path)
  • Iterating to Performance
  • Wizard Gap Reprinted with permission (Matt Mathis, PSC) http://www.psc.edu/~mathis/
  • Wizard Gap
    • Working definition:
    • Ratio of effective network performance attained by an average user to that attainable by a network wizard ….
  • Fix the Network First
    • Three Steps to Performance
      • Clean the network
        • Pre-deployment
        • Monitoring
      • Model traffic
        • Application requirements for QoS/SLA
        • Monitoring for application performance
      • Deploy QoS
  • Lessons Learned
    • Guy Almes, chief engineer, Abilene
      • “The general consensus is that it's easier to fix a performance problem by host tuning and healthy provisioning rather than reserving. But it's understood that this may change over time. [...] For example, of the many performance problems being reported by users, very few are problems that would have been solved by QoS if we'd have had it.”
  • Tools
    • CAIDA Tools (Public)
      • http://www.caida.org/tools/
      • Taxonomies
        • Topology
        • Workload
        • Performance
        • Routing
        • Multicast
  • Recommended (Public) Tools
    • MRTG (SNMP-based router stats)
    • iPerf / NetPerf (active stress testing)
    • Ethereal/WireShark (passive sniffing)
    • NDT (TCP/UDP e2e active probing)
    • Argus (Flow-based traffic monitoring)
    • perfSonar (test/monitor infrastructure)
      • Including OWAMP, BWCTL(iPerf), etc.
  • Tools: OWAMP/BWCTL
    • OWAMP: one way active measurement protocol
      • Ping by any other name would smell as sweet
      • depends on stratum 1 time server at both ends
      • allows finding one way latency problems
    • BWCTL: Bandwidth control
      • management front end to iperf
      • prevent disruption of the network with iperf
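    • For illustration, minimal invocations (placeholder hostnames; owampd/bwctld must be running at the far end, and OWAMP needs well-synchronized clocks):
      • $ owping owamp.far-end.example.org
      • $ bwctl -c receiver.example.org -s sender.example.org -t 30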
  • Tools: BWCTL
    • Typical constraints to running “iperf”
    • Need software on all test systems
    • Need permissions on all systems involved (usually full shell accounts*)
    • Need to coordinate testing with others *
    • Need to run software on both sides with specified test parameters *
    • (* BWCTL was designed to help with these)
  • Tools: ARGUS
    • http://www.qosient.com/argus
      • open source IP auditing tool
      • entirely passive
      • operates from network taps
      • network accounting down to the port level
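    • A sketch of pulling such accounting out of collected data with the ra client (file name and filter are placeholders):
      • $ ra -r /var/log/argus/argus.2004.08.25 - tcp and host aaa.bb.cc.ddd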
  • Traffic Summary from Argus
    • From: Wed Aug 25 5:59:00 2004 To: Thu Aug 26 5:59:00 2004
    • 18,972,261,362 Total 10,057,240,289 Out 8,915,021,073 In
    • aaa.bb.cc.ddd 6,064,683,683 Tot 5,009,199,711 Out 1,055,483,972 In
    • ww.www.ww.www 1,490,107,096 1,396,534,031 93,573,065
    • ww.www.ww.www:11003 1,490,107,096 1,396,534,031 93,573,065
    • xx.xx.xx.xxx 574,727,508 548,101,513 26,625,995
    • xx.xx.xx.xxx:6885 574,727,508 548,101,513 26,625,995
    • yy.yyy.yyy.yyy 545,320,698 519,392,671 25,928,027
    • yy.yyy.yyy.yyy:6884 545,320,698 519,392,671 25,928,027
    • zzz.zzz.zz.zzz 428,146,146 414,054,598 14,091,548
    • zzz.zzz.zz.zzz:6890 428,146,146 414,054,598 14,091,548
  • Tools: ARGUS
    • using ARGUS to identify retransmission type problems.
      • compare total packet size to application data size
    • full (complete packet including IP headers)
    • 12:59:06 d tcp tcp sfu_ip.port -> taiwan_ip.port 9217 18455 497718 27940870
    • app (application data bytes delivered to the user)
    • 12:59:06 d tcp tcp sfu_ip.port -> taiwan_ip.port 9217 18455 0 26944300
    • data transfer one way
      • acks back have no user data
  • Tools: ARGUS
    • compare to misconfigured IP stack
    • full:
    • 15:27:38 * tcp outside_ip.port -> sfu_ip.port 967 964 65885 119588
    • app:
    • 15:27:38 * tcp outside_ip.port -> sfu_ip.port 967 964 2051 55952
    • retransmit rate is constantly above 50%
    • poor throughput
    • this should (and did) set off alarm bells
  • Tools: NDT
    • (Many thanks to Lixin Liu)
    • Test 1: 50% signal on 802.11G
    • WEB100 Enabled Statistics:
    • Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done checking for firewalls . . . . . . . . . . . . . . . . . . . Done running 10s outbound test (client-to-server [C2S]) . . . . . 12.00Mb/s running 10s inbound test (server-to-client [S2C]) . . . . . . 13.90Mb/s
    • ------ Client System Details ------
    • OS data: Name = Windows XP, Architecture = x86, Version = 5.1; Java data: Vendor = Sun Microsystems Inc., Version = 1.5.0_11
  • Tools: NDT
    • ------ Web100 Detailed Analysis ------
    • 45 Mbps T3/DS3 link found.
    • Link set to Full Duplex mode
    • No network congestion discovered.
    • Good network cable(s) found
    • Normal duplex operation found.
    • Web100 reports the Round trip time = 13.09 msec; the Packet size = 1460 Bytes; and
    • There were 63 packets retransmitted, 447 duplicate acks received, and 0 SACK blocks received. The connection was idle 0 seconds (0%) of the time. C2S throughput test: Packet queuing detected: 0.10%. S2C throughput test: Packet queuing detected: 22.81%. This connection is receiver limited 3.88% of the time.
    • This connection is network limited 95.87% of the time.
    • Web100 reports TCP negotiated the optional Performance Settings to:
    • RFC 2018 Selective Acknowledgment: OFF
    • RFC 896 Nagle Algorithm: ON
    • RFC 3168 Explicit Congestion Notification: OFF
    • RFC 1323 Time Stamping: OFF
    • RFC 1323 Window Scaling: ON
  • Tools: NDT
    • Server 'sniffer.ucs.sfu.ca' is not behind a firewall. [Connection to the ephemeral port was successful] Client is not behind a firewall. [Connection to the ephemeral port was successful] Packet size is preserved End-to-End. Server IP addresses are preserved End-to-End. Client IP addresses are preserved End-to-End.
    • ... (lots of web100 stats removed!)
    • aspd: 0.00000
    • CWND-Limited: 4449.30
    • The theoretical network limit is 23.74 Mbps
    • The NDT server has a 8192.0 KByte buffer which limits the throughput to 9776.96 Mbps
    • Your PC/Workstation has a 63.0 KByte buffer which limits the throughput to 38.19 Mbps
    • The network based flow control limits the throughput to 38.29 Mbps
    • Client Data reports link is 'T3', Client Acks report link is 'T3'
    • Server Data reports link is 'OC-48', Server Acks report link is 'OC-12'
  • Tools: NetPerf
    • netperf on the same link.
      • available throughput less than max
    • liu@CLM ~
    • $ netperf -l 60 -H sniffer.ucs.sfu.ca -- -s 1048576 -S 1048576 -m 1048576 TCP STREAM TEST from CLM (0.0.0.0) port 0 AF_INET to sniffer.ucs.sfu.ca (142.58.200.252) port 0 AF_INET
    • Recv Send Send
    • Socket Socket Message Elapsed
    • Size Size Size Time Throughput
    • bytes bytes bytes secs. 10^6bits/sec
    • 2097152 1048576 1048576 60.10 9.91
    • (second run)
    • 2097152 1048576 1048576 61.52 5.32
  • Tools: NDT
    • Test 3: 80% on 802.11A
    • WEB100 Enabled Statistics:
    • Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done checking for firewalls . . . . . . . . . . . . . . . . . . . Done running 10s outbound test (client-to-server [C2S]) . . . . . 20.35Mb/s running 10s inbound test (server-to-client [S2C]) . . . . . . 20.61Mb/s
    • ...
    • The theoretical network limit is 26.7 Mbps. The NDT server has a 8192.0 KByte buffer which limits the throughput to 9934.80 Mbps. Your PC/Workstation has a 63.0 KByte buffer which limits the throughput to 38.80 Mbps. The network based flow control limits the throughput to 38.90 Mbps.
    • Client Data reports link is 'T3', Client Acks report link is 'T3'
    • Server Data reports link is 'OC-48', Server Acks report link is 'OC-12'
  • Tools: NetPerf
    • liu@CLM ~
    • $ netperf -l 60 -H sniffer.ucs.sfu.ca -- -s 1048576 -S 1048576 -m 1048576 TCP STREAM TEST from CLM (0.0.0.0) port 0 AF_INET to sniffer.ucs.sfu.ca (142.58.200.252) port 0 AF_INET
    • Recv Send Send
    • Socket Socket Message Elapsed
    • Size Size Size Time Throughput
    • bytes bytes bytes secs. 10^6bits/sec
    • 2097152 1048576 1048576 60.25 21.86
    • No one else using wireless on A (i.e. the case on a lightpath)
    • NetPerf gets full throughput unlike the G case
  • Tools: perfSONAR
    • Performance Middleware
      • perfSONAR is an international consortium in which Internet2 is a founder and leading participant
      • perfSONAR is a set of protocol standards for interoperability between measurement and monitoring systems
      • perfSONAR is a set of open source web services that can be mixed-and-matched and extended to create a performance monitoring framework
    • Design Goals:
      • Standards-based
      • Modular
      • Decentralized
      • Locally controlled
      • Open Source
      • Extensible
  • perfSONAR Integrates
    • Network measurement tools
    • Network measurement archives
    • Discovery
    • Authentication and authorization
    • Data manipulation
    • Resource protection
    • Topology
  • Performance Measurement: Project Phases
    • Phase 1: Tool Beacons (Today)
      • BWCTL (Complete), http://e2epi.internet2.edu/bwctl
      • OWAMP (Complete), http://e2epi.internet2.edu/owamp
      • NDT (Complete), http://e2epi.internet2.edu/ndt
    • Phase 2: Measurement Domain Support
      • General Measurement Infrastructure (Prototype in Progress)
      • Abilene Measurement Infrastructure Deployment (Complete), http://abilene.internet2.edu/observatory
    • Phase 3: Federation Support (Future)
      • AA (Prototype – optional AES key, policy file, limits file)
      • Discovery (Measurement Nodes, Databases) (Prototype – nearest NDT server, web page)
      • Test Request/Response Schema Support (Prototype – GGF NMWG Schema)
  • Implementation
    • Applications
      • bwctld daemon
      • bwctl client
    • Built upon protocol abstraction library
      • Supports one-off applications
      • Allows authentication/policy hooks to be incorporated
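    • Hypothetical sketch of the kind of policy a bwctld limits file can codify (class name and values invented for illustration; consult the bwctld.limits documentation for exact syntax):
      • limit campus with duration=30, allow_tcp=on, allow_udp=off
      • assign net 142.58.0.0/16 campus
      • assign default campus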
  • LIVE DEMOS
    • NDT
    • AppCritical
    • Q&A
  • Outline (REMOVE)
    • Set the stage – how bad is it (stats)
      • Some stats from industry and SFU
      • What kinds of problems are typical
    • Overview of contexts
      • LAN and campus
      • Core networks including MPLS and optical
      • Wireless
    • Methodologies – sniffing, flows, synthetic traffic, active probing
    • Recommended tools with examples and demo
  • Breakdown of Presentation (REMOVE)
    • Intro and overview (both)
    • Quick demos (both)
    • Stats and experience
      • Industry stats (Loki)
      • Campus experience (Peter)
    • Problem types
      • Seven Deadly Sins (Loki)
      • SFU/BCnet/CANARIE idiosyncrasies (Peter)
    • Context overview (Peter)
    • Methodologies overview (Loki)
    • Tools lists and recommended tools
    • Demos
  • Application Ecology
    • Paraphrasing ITU categories
      • Real-time
        • Jitter sensitive
        • Voice, video, collaborative
      • Synchronous/transactional
        • Response time (RTT) sensitive
        • Database, remote control
      • Data
        • Bandwidth sensitive
        • Transfer, backup/recover
      • Best-effort
        • Not sensitive