
ThousandEyes at Network Field Day 12


On August 8th, 2016, we presented at Network Field Day 12 (NFD 12) on our new Enterprise Agent deployment options, reverse path functionality, Endpoint Agent for end-user monitoring, and Internet Outage Detection.

Published in: Technology
  1. Network Intelligence Without Borders. Mohit Lad, CEO and Co-founder
  2. About ThousandEyes: ThousandEyes delivers network intelligence into every network. Founded by network experts; strong investor backing. Relied on for critical operations by leading enterprises. Recognized as an innovative new approach. 30 of the Fortune 500; 5 of the top 5 SaaS companies; 4 of the top 6 US banks.
  3. When You Think of Network Troubleshooting
  4. Legacy Environments: • On-premises Apps • Users in branch offices over wired connections • MPLS backbone (Diagram labels: NY Branch, HK Branch, Datacenter, MPLS)
  5. Internet Centric Environment: • Adoption of Cloud Applications • Split-tunnel from branch offices • Direct Internet Connectivity between branch offices • Wireless becoming primary connectivity at branch offices • Remote Users accessing cloud applications directly (Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet)
  6. ThousandEyes Cloud Agents (Diagram labels: NY Branch, Datacenter, O365, Internet)
  7. ThousandEyes Enterprise Agents (Diagram labels: NY Branch, Datacenter, O365, Internet)
  8. ThousandEyes Endpoint Agents (Diagram labels: NY Branch, Datacenter, O365, Internet)
  9. Product Design Principles: Intuitive & Effective UI; Harness the Power of SaaS; Innovative Data Collection & Analytics. • Powerful visualizations to model complex data • UI design that is re-usable and scalable • Seamless support help • Minimal deployment effort • Auto-updates • Centralized configuration • Cross-customer data correlation and analysis • Easy data sharing between different customers • Measure black-box environments using active probing • Measure with minimum instrumentation
  10. Rest of the Day: • Tackling Hybrid Network Environments with Enterprise Agents (Nick Kephart) • End to End Visibility with Endpoint Agent (Scott Cressman, Martin Dam) • Internet Outage Detection (Ricardo Oliveira)
  11. Tackling Hybrid Network Environments with Enterprise Agents. Nick Kephart
  12. Enterprise Agent: Internal Vantage Point. Key Use Cases: • Internet connectivity of ISP ingress and egress • WAN visibility between branches and data centers • Performance of web, voice and FTP application traffic (Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet)
  13. Deploying Enterprise Agents: easily deployable across the enterprise WAN and data center. Deployment options: Virtual Appliance, Docker Container, Linux Package, Intel NUC Installer, Cisco IOS Virtual Container (three options are marked New). • Locations with containerized monitoring and operations tools • For remote branches and stores with limited IT infrastructure • Branch and WAN routers (IOS XE 3.17+ on ASR 1000 and ISR 4000)
  14. Visualizing the Entire Network Path. Highlights: • Forward and reverse path (helpful for asymmetric routing) • Measure and locate changes in loss, latency and QoS in each direction • Test over UDP in addition to TCP
  15. End-to-End Visibility with Endpoint Agent. Scott Cressman, Martin Dam
  16. End User Visibility Challenges: • Remote and traveling workers • SaaS deployments • LAN and WAN issues in satellite offices (Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet)
  17. Today’s “Solutions”
  18. Enter ThousandEyes Endpoint Agent: You can’t get this from any other monitoring solution, period. • Extends visibility to the end user, in the office, at home, on the go • Troubleshoot individual user sessions with live performance data • Analyze trends across user populations, applications, geographies
  19. How Endpoint Agent Works: • Lightweight client software: Windows 7+, Mac OS X 10.9+ • Negligible resource consumption: typically <1% CPU, <40MB mem, <50MB disk • Easy deployment via standard tools: msi & pkg installers w/ auto-registration • End-user & background components: browser plugin (Chrome & IE) & system service • Always up-to-date: updates automatically, runs in the background. Data collected: • WEB/APPLICATION: completion, availability, response time, page load waterfall • NETWORK: loss, latency, jitter, failures, path visualization, wireless topology, VPN, proxy, Wi-Fi quality (live user sessions!). Scope: browser-based web applications • Only collects data for domains you choose to monitor. Data streamed instantly to ThousandEyes service.
  20. Complete Visibility from End User to Application
  21. Internet Outage Detection. Ricardo Oliveira, CTO and Co-founder
  22. The Problem Landscape: • Lack of visibility to apps relying on the Internet: {UC,S,I,P}aaS • Lack of visibility to wireless/remote/mobile users • Traditional NPM solutions are designed for static clients and on-prem apps (packet capture, SNMP polling) (Diagram labels: NY Branch, HK Branch, Datacenter, O365, Internet)
  23. ThousandEyes Agents (Diagram labels: NY Branch, Datacenter, O365, Internet)
  24. Drive for Internet Outage Detection: • Internet is a shared network: the same event impacts multiple customers • Harness data from multiple customers for more accurate inference of the problem • Drive more value to customers with knowledge of the depth and breadth of the problem
  25. Overview: Internet Outage Detection. Traffic Outage Detection: • Detect outages in ISPs and understand their impact both globally and as it relates to a specific customer. Routing Outage Detection: • See the global and account scope, as well as likely root cause of BGP reachability outages
  26. How Traffic Outage Detection Works: 1. Anonymized (http) traffic data is aggregated from all tests across the entire user base 2. Algorithms then look for patterns in path traces terminating in the same ISP 3. Exclude: noisy interfaces and networks not belonging to ISPs (Diagram labels: New York Cloud Agent, Boston Enterprise Agent, Los Angeles Cloud Agent, Level 3 in San Jose, Cogent in Denver, Salesforce, Google, NY Times, Customer 1, Customer 2)
  27. Traffic Outage Detection: Account scope; Global scope; Severity and scope of the issue at this interface
  28. Traffic Outages All the Time: • ~170 affected interfaces / hour
  29. Routing Outage Detection: Aggregates reachability issues in routing data from 350 routers. Global scope; Account scope; Root cause analysis
  30. Routing Outages All the Time: • ~1.6k prefixes affected / hour
  31. Recent Major Outages Detected (timeline dates: April 23, May 3, May 20, June 6, June 24, July 10, July 17): • Hurricane Electric route leak affecting AWS • Trans-Atlantic issues in Level 3 • Tata and TISparkle issues with submarine cable • Hurricane Electric removed >500 prefixes • Tata cable cut in Singapore affecting Dropbox • Level 3, NTT routing issues affecting JIRA • Widespread issues in Telia’s network in Ashburn
  32. Examples of Notable Outages
  33. Example 1: Network Layer Issues in Telia in Ashburn. Detected outage coincides with packet loss spikes. Ashburn, VA is “ground zero” for this outage.
  34. Specific Failure Points in Telia: High severity and wide scope (outages affecting at least 20 tests for a NA/EU interface are likely to be wide in scope). Terminal nodes in Telia.
  35. Example 2: Hurricane Electric Route Flap. Detected outage coincides with spike in AS path changes. Root cause analysis points to Hurricane Electric and Telx.
  36. Route Flap by Hurricane Electric: Routes flap from using HE to NTT, then back to HE. (Diagram label: Hurricane Electric)
  37. Traffic Issues in Hurricane Electric (Diagram label: Hurricane Electric)
  38. Example 3: NTT and Level 3 Routing Issues Affect JIRA. JIRA saw 0% availability and 100% packet loss. Most affected interfaces are in Ashburn, VA.
  39. Traffic Terminating in NTT: Traffic paths originally traversed Level 3 and NTT. Traffic paths then change to traverse only NTT, terminating there.
  40. JIRA’s /24 Prefix Becomes Unreachable: As the primary upstream ISP, Level 3 is associated with the most affected routes. Routes through upstream ISPs NTT and Level 3 all withdrawn.
  41. Routers Begin Using Misconfigured /16 Prefix: The backup /16 prefix directs to NTT, not JIRA’s network. This is why the traffic path changed to traverse only NTT, terminating there when JIRA’s IP couldn’t be found in NTT’s network.
  42. What’s Next. Traffic Outages @ Cloud: • IaaS/PaaS (CDNs, hosting, DNS providers) • SaaS (+ app context). Routing Outages: • Leaks and hijacks. Outage Event Stream: • Outage geo + topology maps • Alerts based on outage impact/location/type/etc
  43. Outage Created by Level 3 Flap
  44. • Look for purple indicators and the ‘Outage Detected’ dropdown when investigating issues; these indicate detected outages! • Use quick links or select specific nodes/ASes to see how paths have changed over time • Correlate data from the web, network and routing layers to analyze root cause • See our blogs and Knowledge Base articles for more info: Blog on Traffic Outage Detection; Blog on Routing Outage Detection; Knowledge Base: Tips for Diagnosing Internet Outages
  45. Thank You. @thousandeyes
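
The three steps on the "How Traffic Outage Detection Works" slide (aggregate traces from all tests, look for traces terminating in the same ISP, exclude noisy interfaces and non-ISP networks) can be sketched roughly as below. This is a minimal illustration and not ThousandEyes' implementation: the `PathTrace` record, its field names, the `min_tests` threshold, the interface names and the sample data are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathTrace:
    """One anonymized path trace (hypothetical record, not a ThousandEyes type)."""
    test_id: str                 # test that produced the trace
    account: str                 # customer account that owns the test
    terminal_interface: str      # interface where the trace stopped
    terminal_isp: Optional[str]  # ISP owning that interface, None if not an ISP
    reached_target: bool         # True if the trace got to its destination


def detect_traffic_outages(traces, noisy_interfaces=frozenset(), min_tests=2):
    """Group failed traces by terminal interface across all accounts."""
    tests = defaultdict(set)
    accounts = defaultdict(set)
    isps = {}
    for t in traces:
        if t.reached_target:
            continue  # only traces that terminate early are of interest
        # Step 3 on the slide: skip noisy interfaces and non-ISP networks.
        if t.terminal_isp is None or t.terminal_interface in noisy_interfaces:
            continue
        tests[t.terminal_interface].add(t.test_id)
        accounts[t.terminal_interface].add(t.account)
        isps[t.terminal_interface] = t.terminal_isp
    # Assumed rule: report an interface once enough distinct tests stop there.
    return {
        iface: {"isp": isps[iface], "tests": len(ids), "accounts": len(accounts[iface])}
        for iface, ids in tests.items()
        if len(ids) >= min_tests
    }


# Made-up sample data; ISP names are borrowed from the slide's diagram labels.
traces = [
    PathTrace("t1", "customer-1", "if-sjc-1", "Level 3", False),
    PathTrace("t2", "customer-2", "if-sjc-1", "Level 3", False),
    PathTrace("t3", "customer-2", "if-den-1", "Cogent", False),
    PathTrace("t4", "customer-1", "if-lan-1", None, False),
    PathTrace("t5", "customer-1", "if-sjc-1", "Level 3", True),
]
outages = detect_traffic_outages(traces)
```

Counting distinct accounts alongside distinct tests is one way to separate the "account scope" and "global scope" views the slides mention; the slides do not say how the product computes them.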
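
The Hurricane Electric example describes routes that flap from HE to NTT and back, coinciding with a spike in AS path changes. A minimal sketch of how such a pattern could be spotted in a time-ordered list of AS paths for one prefix follows. The function names and the sample paths are assumptions for illustration; AS6939 and AS2914 are the public AS numbers of Hurricane Electric and NTT, while 64500 and 64501 come from the range reserved for documentation.

```python
from itertools import groupby


def collapse(paths):
    """Drop consecutive duplicates so each entry marks a real path change."""
    return [path for path, _ in groupby(paths)]


def count_path_changes(paths):
    """Number of times the AS path changed over the observation window."""
    return max(len(collapse(paths)) - 1, 0)


def is_flap(paths):
    """True if the route moved away from a path and later returned to it."""
    collapsed = collapse(paths)
    return len(set(collapsed)) < len(collapsed)


# Illustrative observations for one prefix as seen from one monitor (AS64501).
via_he = (64501, 6939, 64500)   # via Hurricane Electric
via_ntt = (64501, 2914, 64500)  # via NTT
observed = [via_he, via_he, via_ntt, via_he]

changes = count_path_changes(observed)
flapped = is_flap(observed)
```

A real detector would work on BGP updates from many routers and apply time windows; this sketch only shows the "changed, then changed back" check.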
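
The JIRA example hinges on longest-prefix matching: once the /24 routes were withdrawn, routers fell back to a covering /16 that pointed at NTT. The sketch below shows that fallback with Python's standard `ipaddress` module. The prefixes and next-hop labels are placeholders chosen for illustration (198.51.100.0/24 is a documentation range), not JIRA's actual address space.

```python
import ipaddress


def longest_prefix_match(routes, destination):
    """Return (prefix, next_hop) for the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    network, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hop


# Hypothetical routing table before the outage.
routes = {
    "198.51.100.0/24": "application network",  # specific route to the service
    "198.51.0.0/16": "NTT",                    # covering backup prefix
}
before = longest_prefix_match(routes, "198.51.100.7")

# The /24 is withdrawn; only the covering /16 remains.
del routes["198.51.100.0/24"]
after = longest_prefix_match(routes, "198.51.100.7")
```

With the /24 present, the more specific route wins; after withdrawal, the same destination resolves to the /16 and traffic is handed to NTT, which matches the path change described on the slides.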