Network Monitoring Webcast


Published in: Technology

  1. 1. Network Monitoring Shifting from Fire Fighting to Fire Prevention Mike Pennacchi Network Protocol Specialists, LLC
  2. 2. Presenter: <ul><li>Mike Pennacchi </li></ul><ul><li>Owner – Network Protocol Specialists, LLC </li></ul><ul><li>Trainer and Network Troubleshooting Consultant </li></ul>
  3. 3. Where we are today <ul><li>Wait for user community to complain about performance </li></ul><ul><li>Don’t maintain accurate documentation </li></ul><ul><li>Don’t develop baseline of network utilization </li></ul><ul><li>Don’t know which protocols are in use on our own networks </li></ul><ul><li>Don’t have snapshots of applications working well </li></ul>
  4. 4. Why do we do this? <ul><li>Everyone loves firefighters! </li></ul>
  5. 5. What is good about being Firefighters? <ul><li>Helping people in their time of need </li></ul><ul><li>Solving complex problems under strict timelines </li></ul><ul><li>Easy to justify your existence in the organization </li></ul><ul><li>You can let your user community deploy applications never intended to run on the WAN and save them when they fail </li></ul>
  6. 6. Why being a Firefighter is bad <ul><li>End up making reactive changes that may not best serve the long-term goals of the organization </li></ul><ul><li>The network is your responsibility, not your customer’s responsibility </li></ul><ul><li>Once the problem has occurred, the organization is incurring unnecessary costs </li></ul>
  7. 7. Where we need to be <ul><li>Fire Marshal </li></ul>
  8. 8. The Fire Marshal’s Role <ul><li>Inspect building plans prior to permitting </li></ul><ul><li>Conduct inspections of the buildings as construction proceeds </li></ul><ul><li>Periodically inspect buildings to ensure they are compliant with the local fire codes </li></ul><ul><li>Review the findings of fires that have occurred and use this to help prevent fires in the future </li></ul>
  9. 9. Being a Network Fire Marshal <ul><li>Inspecting applications prior to deployment </li></ul><ul><li>Monitoring the network for signs of performance impairments </li></ul><ul><li>Validating that network changes are made as planned and work as advertised </li></ul><ul><li>Establishing interoperability guidelines </li></ul><ul><li>Using data collected to make improvements to the network </li></ul>
  10. 10. How do we become a Network Fire Marshal? <ul><li>Assess the existing network infrastructure starting at the bottom and working our way to the top </li></ul><ul><li>Capture and review those applications that are working properly, understand their dependencies </li></ul><ul><li>Carefully review new applications prior to deployment to determine their impact on the network, and the network’s impact on the application </li></ul>
  11. 11. Assessing the Network
  12. 12. Assessing the existing network <ul><li>It is important to start at the bottom of the OSI model and work our way up </li></ul><ul><li>Think of the ABC’s of CPR </li></ul><ul><ul><li>Without a good airway, breathing is pointless </li></ul></ul><ul><ul><li>Without breathing, circulation is not sending oxygenated blood to the cells that need it </li></ul></ul><ul><li>On the network we must ensure the lower layers are doing their job before we jump to the upper layers </li></ul>
  13. 13. Monitoring the lower layers <ul><li>Problems with the physical layer can be observed by monitoring the error statistics on managed switches </li></ul><ul><li>One of the most prevalent problems on most networks is still a mismatch in Ethernet duplex settings </li></ul><ul><li>Duplex mismatches will show up as FCS/CRC and alignment errors on the switches </li></ul>
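The error-statistics check described on this slide can be sketched as a comparison of two polls of a switch port. This is a minimal illustration, not a vendor tool: the `PortCounters` fields, the way the counters are obtained, and the 0.1% threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PortCounters:
    """One poll of a switch port (field names are illustrative assumptions)."""
    frames_in: int
    fcs_errors: int
    alignment_errors: int


def error_rate(previous: PortCounters, current: PortCounters) -> float:
    """Errors per received frame between two polls of the same port."""
    frames = current.frames_in - previous.frames_in
    errors = (current.fcs_errors - previous.fcs_errors) + (
        current.alignment_errors - previous.alignment_errors
    )
    if frames <= 0:
        return 0.0  # no traffic in the interval, nothing to judge
    return errors / frames


def needs_duplex_check(previous: PortCounters, current: PortCounters,
                       threshold: float = 0.001) -> bool:
    """Flag the port for a manual duplex check (threshold is an assumed value)."""
    return error_rate(previous, current) > threshold
```

A port flagged this way is only a candidate: bad cabling or a failing interface can produce the same counters, so the duplex settings on both ends still need to be inspected by hand.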
  14. 14. Monitoring Utilization <ul><li>We have seen companies buy more bandwidth, thinking they were at 100% utilization, when in reality they were below 30% </li></ul><ul><li>In other cases, companies thought they had more than enough bandwidth when they were in fact at 100% utilization </li></ul>
  15. 15. Places to Monitor <ul><li>#1 – Wide Area Network links </li></ul><ul><li>#2 – Links between switches and routers </li></ul><ul><li>#3 – Switch uplinks </li></ul><ul><li>#4 – Key devices such as servers </li></ul>
  16. 16. How to Monitor <ul><li>Tools such as the Multi-Router Traffic Grapher (MRTG) are free and easy to install </li></ul><ul><li>Many service providers provide online resources that show utilization and error rates for managed circuits </li></ul>
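Tools of this kind derive utilization from interface octet counters polled at a fixed interval. The arithmetic can be sketched as follows; the function name and parameters are hypothetical, and the example assumes a 32-bit counter that wraps at most once between polls.

```python
def utilization_percent(octets_start: int, octets_end: int,
                        interval_seconds: float, link_bps: float,
                        counter_bits: int = 32) -> float:
    """Average link utilization between two polls of an octet counter."""
    if interval_seconds <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    # The modulo handles a single counter wrap between the two polls.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / interval_seconds
    return 100.0 * bits_per_second / link_bps


# Example: a 1.544 Mbps link polled every 5 minutes.
print(utilization_percent(0, 17_370_000, 300, 1_544_000))
```

Averaging over a five-minute interval hides short bursts, so a link that graphs at 30% may still be saturated for seconds at a time.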
  17. 17. MRTG
  18. 18. Looking back a year
  19. 19. Monitoring the Network Layer <ul><li>Devices send Address Resolution Protocol (ARP) packets based on their IP Address and Subnet Mask </li></ul><ul><li>By monitoring the network and looking for devices that are ARPing for the wrong range of addresses, we can detect devices that are misconfigured at the network layer </li></ul>
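A minimal sketch of this check, assuming the ARP requests have already been extracted from a capture as `(sender_ip, target_ip)` pairs (that input format and the function name are assumptions for illustration): a correctly configured host only ARPs for addresses inside its own subnet and sends everything else to its gateway.

```python
import ipaddress


def misdirected_arp_requests(arp_requests, local_subnet):
    """Map each sender to the off-subnet addresses it ARPed for."""
    network = ipaddress.ip_network(local_subnet)
    suspects = {}
    for sender_ip, target_ip in arp_requests:
        # An ARP for an address outside the subnet suggests a wrong mask or IP.
        if ipaddress.ip_address(target_ip) not in network:
            suspects.setdefault(sender_ip, []).append(target_ip)
    return suspects


requests = [
    ("10.1.1.20", "10.1.1.1"),   # normal: ARP for the local gateway
    ("10.1.1.37", "10.1.9.5"),   # suspicious: target is outside 10.1.1.0/24
]
print(misdirected_arp_requests(requests, "10.1.1.0/24"))
```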
  20. 20. Monitoring the Transport Layer <ul><li>Tools such as Iperf (a free throughput tool) allow us to measure the TCP throughput between two devices on the network </li></ul><ul><li>Measuring at this layer helps us determine if the network is able to transport data without packet loss from end to end </li></ul><ul><li>We can also determine if parameters such as the TCP window size are properly configured </li></ul>
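The reason the window size matters can be shown with a short calculation: a single TCP connection can have at most one window of data in flight per round trip, so window size and round-trip time together cap its throughput. The function names below are hypothetical, and the sketch ignores packet loss and protocol overhead.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Window required to keep a link full (the bandwidth-delay product)."""
    return link_bps * rtt_seconds / 8


# A 65,535-byte window over an 80 ms round trip tops out near 6.5 Mbps,
# regardless of how fast the link itself is.
print(max_tcp_throughput_bps(65_535, 0.080))
```

Comparing this bound with a measured result helps separate a window-limited transfer from one that is slowed by loss or congestion.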
  21. 21. Iperf
  22. 22. Understanding Existing Applications
  23. 23. Capture those applications that are working properly <ul><li>It is very difficult to determine what is broken in an application when you have never seen it working properly </li></ul><ul><li>By capturing an application while it is working well, it is possible to compare this good sample with a capture of the broken application in the future </li></ul>
  24. 24. Determining Dependencies <ul><li>As part of this application capture, it is important to determine the application’s dependencies </li></ul><ul><li>Which DNS servers is it using? </li></ul><ul><li>Does it use Domain Authentication? </li></ul><ul><li>Is it sending database calls to another server? </li></ul>
  25. 25. Monitoring Server Response <ul><li>We were at a client site performing a network health check </li></ul><ul><li>As part of the health check, we always monitor the server response times </li></ul><ul><li>Found that the DNS server was taking up to 5 seconds to respond to requests </li></ul>
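One way to answer these questions from a capture is to group the client's outbound flows by destination port. The sketch below assumes the flows have already been exported as `(source_ip, destination_ip, destination_port)` tuples; the port table is a small illustrative subset, and an application using non-standard ports would need a different mapping.

```python
from collections import defaultdict

# Illustrative subset of well-known server ports.
WELL_KNOWN_PORTS = {
    53: "DNS",
    88: "Kerberos",
    389: "LDAP",
    1433: "Microsoft SQL Server",
}


def summarize_dependencies(flows, client_ip):
    """Group the servers a client talks to by the service each port suggests."""
    dependencies = defaultdict(set)
    for source_ip, destination_ip, destination_port in flows:
        if source_ip != client_ip:
            continue  # only the client's outbound flows describe its dependencies
        service = WELL_KNOWN_PORTS.get(destination_port, f"port {destination_port}")
        dependencies[service].add(destination_ip)
    return {service: sorted(servers) for service, servers in dependencies.items()}
```

Saving this summary alongside the known-good capture gives a checklist of servers to verify the next time the application misbehaves.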
  26. 26. Slow DNS Server
  27. 27. Slow DNS Server <ul><li>Turned out packets were being lost due to congestion going to the Internet </li></ul><ul><li>Problem had existed for so long, the user community had accepted this was how long it took to get to sites on the Internet </li></ul>
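A response-time check like the one in this health check can be sketched by pairing DNS queries with their responses in a capture. The input format here, `(timestamp_seconds, transaction_id, is_response)` tuples, and the one-second threshold are assumptions for illustration; a real capture would first need to be decoded into this form.

```python
def dns_response_times(packets):
    """Return the delay between each DNS query and its matching response."""
    pending = {}  # transaction ID -> time the first query was seen
    times = []
    for timestamp, transaction_id, is_response in packets:
        if not is_response:
            # Keep the first query time so repeated queries count as waiting time.
            pending.setdefault(transaction_id, timestamp)
        elif transaction_id in pending:
            times.append(timestamp - pending.pop(transaction_id))
    return times


def slow_responses(times, threshold_seconds=1.0):
    """Response times above an assumed acceptable threshold."""
    return [t for t in times if t > threshold_seconds]
```

Running this regularly, rather than only during a health check, would surface a slow server before users come to accept the delay as normal.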
  28. 28. Testing New Applications
  29. 29. Carefully review new applications <ul><li>Upon reaching the top of the OSI model, we are at the Application Layer </li></ul><ul><li>Having gotten to this point, we know our infrastructure is good and we have put into place the tools necessary to ensure it stays this way </li></ul><ul><li>Here we will discuss some approaches to assessing new applications </li></ul>
  30. 30. Try before you buy <ul><li>When buying a used automobile, most people would take it to a mechanic to have it checked out </li></ul><ul><li>Unfortunately in the networking world, applications are often purchased without evaluating how they will perform on the purchaser’s network </li></ul>
  31. 31. Reactive Tools, Proactive Testing <ul><li>Many of the same tools for troubleshooting problems can be used to test applications prior to deployment </li></ul><ul><li>The results of these tests can be passed on to the group purchasing the application to let them know how it should perform </li></ul>
  32. 32. Some of the tools <ul><li>WAN Emulation </li></ul><ul><ul><li>Used to simulate a wide area network in a controlled environment. Used in troubleshooting for recreating actual network conditions </li></ul></ul><ul><ul><li>Allows clients and servers to be in the same room, but act as if they are miles apart </li></ul></ul>
  33. 33. WAN Emulation
  34. 34. Some of the tools <ul><li>Software Prediction </li></ul><ul><ul><li>Uses captured traffic to estimate the transaction time over a variety of network conditions </li></ul></ul><ul><ul><li>In a troubleshooting situation, we use this to determine if the application is taking as long as we expect </li></ul></ul>
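The idea behind this kind of prediction can be illustrated with a deliberately simplified model. This is not how any particular product works; it assumes a transaction's time is the sum of round-trip delays for each application turn, the serialization time of the payload, and server processing time. The function and parameter names are hypothetical.

```python
def estimate_transaction_seconds(app_turns: int, payload_bytes: int,
                                 rtt_seconds: float, bandwidth_bps: float,
                                 server_seconds: float = 0.0) -> float:
    """Rough transaction time: turn delays + transfer time + server time."""
    turn_delay = app_turns * rtt_seconds          # each request/response pair waits one RTT
    transfer_time = payload_bytes * 8 / bandwidth_bps
    return turn_delay + transfer_time + server_seconds


# The same captured transaction (200 turns, 500 KB) on a LAN and on a WAN link.
lan = estimate_transaction_seconds(200, 500_000, 0.001, 100_000_000)
wan = estimate_transaction_seconds(200, 500_000, 0.080, 1_544_000)
print(f"LAN: {lan:.2f} s, WAN: {wan:.2f} s")
```

Even this crude model shows why an application with many turns can be acceptable on the LAN and unusable across the WAN: the turn delay, not the bandwidth, dominates the result.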
  35. 35. Predicting Application Performance
  36. 36. Summary <ul><li>As with fires, most network problems can be prevented </li></ul><ul><li>To be a good network fire marshal, you must understand your existing network </li></ul><ul><li>You must be able to monitor your network and see when it is operating abnormally </li></ul><ul><li>Lastly, prevent the deployment of new applications that do not operate properly </li></ul>
  37. 37. Thank You