Network Monitoring: Shifting from Fire Fighting to Fire Prevention
Mike Pennacchi, Network Protocol Specialists, LLC
Presenter: Mike Pennacchi
  • Owner – Network Protocol Specialists, LLC
  • Trainer and Network Troubleshooting Consultant
Where we are today
  • Wait for user community to complain about performance
  • Don’t maintain accurate documentation
  • Don’t develop baseline of network utilization
  • Don’t know which protocols are in use on our own networks
  • Don’t have snapshots of applications working well
Why do we do this? Everyone loves firefighters!
What is good about being Firefighters?
  • Helping people in their time of need
  • Solving complex problems under strict timelines
  • Easy to justify your existence in the organization
  • You can let your user community deploy applications never intended to run on the WAN and save them when they fail
Why being a Firefighter is bad
  • End up making reactive changes that may not best serve the long-term goals of the organization
  • The network is your responsibility, not your customer’s responsibility
  • Once the problem has occurred, the organization is incurring unnecessary costs
Where we need to be
  • Fire Marshal
The Fire Marshal’s Role
  • Inspect building plans prior to permitting
  • Conduct inspections of the buildings as construction proceeds
  • Periodically inspect buildings to ensure they are compliant with the local fire codes
  • Review the findings of fires that have occurred and use this to help prevent fires in the future
Being a Network Fire Marshal
  • Inspecting applications prior to deployment
  • Monitoring the network for signs of performance impairments
  • Validating that network changes are made as planned and work as advertised
  • Establishing interoperability guidelines
  • Using data collected to make improvements to the network
How do we become a Network Fire Marshal?
  • Assess the existing network infrastructure, starting at the bottom and working our way to the top
  • Capture and review those applications that are working properly, and understand their dependencies
  • Carefully review new applications prior to deployment to determine their impact on the network, and the network’s impact on the application
Assessing the Network
Assessing the existing network
  • It is important to start at the bottom of the OSI model and work our way up
  • Think of the ABCs of CPR
      • Without a good airway, breathing is pointless
      • Without breathing, circulation is not sending oxygenated blood to the cells that need it
  • On the network, we must ensure the lower layers are doing their job before we jump to the upper layers
Monitoring the lower layers
  • Problems with the physical layer can be observed by monitoring the error statistics on managed switches
  • One of the most prevalent problems on most networks is still a mismatch in Ethernet duplex settings
  • Duplex errors will show up as FCS/CRC and Alignment errors on the switches
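As a minimal sketch of watching those error statistics, the Python below compares two polls of a port's counters and flags ports whose FCS and alignment errors make up more than a small fraction of received frames. The PortCounters fields, the port names, and the 0.1% threshold are illustrative assumptions, not values from the presentation, and counter wrap is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class PortCounters:
    """Cumulative counters for one switch port (hypothetical field names)."""
    frames_in: int
    fcs_errors: int
    alignment_errors: int


def error_rate(before, after):
    """Fraction of frames received between two polls that had FCS or alignment errors."""
    frames = after.frames_in - before.frames_in
    errors = (after.fcs_errors - before.fcs_errors) + (
        after.alignment_errors - before.alignment_errors
    )
    if frames <= 0:
        return 0.0  # no traffic in the interval, nothing to judge
    return errors / frames


def suspect_ports(samples, threshold=0.001):
    """Return the sorted names of ports whose error rate exceeds the threshold.

    samples maps a port name to a (before, after) pair of PortCounters.
    The default threshold of 0.1% is an assumed starting point only.
    """
    return sorted(
        port for port, (before, after) in samples.items()
        if error_rate(before, after) > threshold
    )
```

A port that shows up here is a candidate for a duplex check on both ends of the link; the counters alone do not prove a mismatch.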
Monitoring Utilization
  • We have seen companies buy more bandwidth, thinking they were at 100% utilization, when in reality they were below 30%
  • In other cases, companies thought they had more than enough bandwidth and were actually at 100% utilization
Places to Monitor
  #1 – Wide Area Network links
  #2 – Links between switches and routers
  #3 – Switch uplinks
  #4 – Key devices such as servers
How to Monitor
  • Tools such as the Multi-Router Traffic Grapher (MRTG) are free and easy to install
  • Many service providers provide online resources that show utilization and error rates for managed circuits
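Graphing tools of this kind typically work from interface octet counters polled at a fixed interval. The sketch below shows the arithmetic behind a utilization figure, assuming 32-bit counters that wrap at 2^32 and a link speed supplied by the caller. The function names are hypothetical and this is not MRTG's own code.

```python
COUNTER32_MODULUS = 2 ** 32  # assumed 32-bit octet counter


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Octets counted between two polls, allowing for a single counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_s, link_bps):
    """Average link utilization over the polling interval, as a percentage."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * link_bps)
```

For example, 17,370,000 octets in a 300-second interval on a 1.544 Mbps circuit works out to 30% utilization. Because the figure is an average over the whole interval, short bursts at full rate can hide inside a low reading.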
MRTG
Looking back a year
Monitoring the Network Layer
  • Devices send Address Resolution Protocol (ARP) packets based on their IP Address and Subnet Mask
  • By monitoring the network and looking for devices that are ARPing for the wrong range of addresses, we can detect devices that are misconfigured at the network layer
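One way to picture this check: a host whose mask is too wide (for example 255.255.0.0 on a 255.255.255.0 segment) treats remote addresses as local and ARPs for them instead of sending to its gateway. The sketch below takes ARP requests already extracted from a capture as (sender, target) pairs and lists senders whose targets fall outside the expected subnet. The function name, addresses, and input format are assumptions for illustration.

```python
import ipaddress


def misdirected_arps(arp_requests, expected_network):
    """Map each sender IP to the ARP targets it requested outside expected_network.

    arp_requests is an iterable of (sender_ip, target_ip) strings.
    expected_network is the subnet the segment should be using, e.g. "10.1.1.0/24".
    """
    network = ipaddress.ip_network(expected_network)
    suspects = {}
    for sender, target in arp_requests:
        if ipaddress.ip_address(target) not in network:
            suspects.setdefault(sender, []).append(target)
    return suspects


requests = [
    ("10.1.1.20", "10.1.1.1"),   # normal: ARP for the local gateway
    ("10.1.1.37", "10.1.9.5"),   # suspicious: target is off the local subnet
    ("10.1.1.37", "10.1.1.1"),
]
print(misdirected_arps(requests, "10.1.1.0/24"))
```

Here only 10.1.1.37 is reported, which makes it the device whose IP address and subnet mask are worth checking.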
Monitoring the Transport Layer
  • Tools such as Iperf (a free throughput tool) allow us to measure the TCP throughput between two devices on the network
  • Measuring at this layer helps us determine if the network is able to transport data without packet loss from end to end
  • We can also determine if parameters such as the TCP window size are properly configured
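To judge whether a measured throughput is reasonable, it helps to know the ceiling the TCP window imposes: a sender can have at most one window of data in flight per round trip. The sketch below computes that upper bound and the window needed to fill a given link. It is a simplified model that ignores packet loss, slow start, and protocol overhead, and the function names are hypothetical.

```python
def max_tcp_throughput_bps(window_bytes, rtt_s):
    """Upper bound on single-connection TCP throughput: one window per round trip."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return window_bytes * 8 / rtt_s


def window_needed_bytes(link_bps, rtt_s):
    """Window size (the bandwidth-delay product) needed to keep the link full."""
    if link_bps <= 0 or rtt_s <= 0:
        raise ValueError("link_bps and rtt_s must be positive")
    return link_bps * rtt_s / 8
```

With a 65,535-byte window (the largest possible without window scaling) and a 50 ms round trip, the bound is roughly 10.5 Mbps, no matter how fast the link is. Filling a 100 Mbps path at the same round-trip time would need a window of about 625,000 bytes.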
Iperf
Understanding Existing Applications
Capture those applications that are working properly
  • It is very difficult to determine what is broken in an application when you have never seen it working properly
  • By capturing an application while it is working well, it is possible to compare this good sample with a capture of the broken application in the future
Determining Dependencies
  • As part of this application capture, it is important to determine the application’s dependencies
  • Which DNS servers is it using?
  • Does it use Domain Authentication?
  • Is it sending database calls to another server?
Monitoring Server Response
  • We were at a client site performing a network health check
  • As part of the health check, we always monitor the server response times
  • We found that the DNS server was taking up to 5 seconds to respond to requests
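A rough sketch of how response times can be pulled from a capture: pair each DNS query with its response by client and transaction ID, then report the elapsed time. The input is assumed to be time-ordered tuples already extracted from the capture; the tuple layout, the function names, and the 1-second threshold are illustrative assumptions. Retransmitted queries are timed from the first attempt, since that is the delay the user experiences.

```python
def dns_response_times(packets):
    """Return (client, txn_id, seconds) for each answered DNS query.

    packets is a time-ordered iterable of (timestamp_s, client, txn_id, is_response).
    """
    pending = {}
    results = []
    for timestamp, client, txn_id, is_response in packets:
        key = (client, txn_id)
        if not is_response:
            # Keep the time of the first query so retries count toward the delay.
            pending.setdefault(key, timestamp)
        elif key in pending:
            results.append((client, txn_id, timestamp - pending.pop(key)))
    return results


def slow_lookups(packets, threshold_s=1.0):
    """Answered lookups that took longer than threshold_s (assumed default: 1 second)."""
    return [entry for entry in dns_response_times(packets) if entry[2] > threshold_s]
```

Queries that never receive a response are left unreported in this sketch; counting them separately would be a natural extension.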
Slow DNS Server
Slow DNS Server
  • It turned out packets were being lost due to congestion going to the Internet
  • The problem had existed for so long that the user community had accepted this was how long it took to get to sites on the Internet
Testing New Applications
Carefully review new applications
  • Upon reaching the top of the OSI model, we are at the Application Layer
  • Having gotten to this point, we know our infrastructure is good and we have put into place the tools necessary to ensure it stays this way
  • Here we will discuss some approaches to assessing new applications
Try before you buy
  • When buying a used automobile, most people would take it to a mechanic to have it checked out
  • Unfortunately, in the networking world, applications are often purchased without evaluating how they will perform on the purchaser’s network
Reactive Tools, Proactive Testing
  • Many of the same tools for troubleshooting problems can be used to test applications prior to deployment
  • The results of these tests can be passed on to the group purchasing the application to let them know how it should perform
Some of the tools: WAN Emulation
  • Used to simulate a wide area network in a controlled environment
  • Used in troubleshooting for recreating actual network conditions
  • Allows clients and servers to be in the same room, but act as if they are miles apart
WAN Emulation
Some of the tools: Software Prediction
  • Uses captured traffic to estimate the transaction time over a variety of network conditions
  • In a troubleshooting situation, we use this to determine if the application is taking as long as we expect
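The idea behind this kind of prediction can be sketched with a deliberately simple model: each request/response exchange ("application turn") costs one round trip, and the payload costs its size divided by the bandwidth. The formula, function name, and sample figures below are assumptions for illustration, not the method of any particular prediction product, and the model ignores loss, TCP windowing, and server processing time.

```python
def estimate_transaction_time_s(app_turns, payload_bytes, rtt_s, bandwidth_bps):
    """Rough transaction time: round trips for each turn plus payload transfer time."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be positive")
    return app_turns * rtt_s + payload_bytes * 8 / bandwidth_bps


# Turn count and payload size would come from a capture of the working application.
# These values are made up for the example.
turns, payload = 400, 500_000

lan_time = estimate_transaction_time_s(turns, payload, rtt_s=0.001, bandwidth_bps=100_000_000)
wan_time = estimate_transaction_time_s(turns, payload, rtt_s=0.080, bandwidth_bps=1_544_000)
print(f"LAN: {lan_time:.2f} s, WAN: {wan_time:.2f} s")
```

In this made-up case the same transaction goes from under half a second on the LAN to more than half a minute on the WAN, and nearly all of the increase comes from the turn count rather than the bandwidth.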
Predicting Application Performance
Summary
  • As with fires, most network problems can be prevented
  • To be a good network fire marshal, you must understand your existing network
  • You must be able to monitor your network and see when it is operating abnormally
  • Lastly, prevent the deployment of new applications that do not operate properly
Thank You

Network Monitoring Webcast
