Network Troubleshooting in the Cloud: Tools, Techniques, and Gotchas
AWS Bootcamp #8 – September 6, 2018
Sherry Wei, Founder & CTO
Neel Kamal, Head of Field Operations
Frank Cabri, VP Product Marketing
© 2017 AVIATRIX SYSTEMS, INC.
Welcome & Agenda
• Introductions
• Understanding VPC Networking Elements
• Common Troubleshooting Scenarios
• Demo
• Q & A
FEATURED SPEAKERS
SHERRY WEI, Founder & CTO
NEEL KAMAL, Head of Field Operations
Check Out More Bootcamps – Available On-Demand
www.aviatrix.com/bootcamps
Network Problems Often Appear at the App Layer …
“My production app can’t reach the on-prem database. It was working yesterday. Can you fix the network?”
“My instance is running but I can’t reach the Internet. Is the network down?”
“From my QA instance, I can no longer SSH into production. You need to fix the network fast!”
“VPN performance really sucks. Joe moved to Japan and he’s griping that remote access to dev is way too slow.”
… and Get Progressively Harder as You Dig Deeper
“A customer’s route table propagated to my cloud environment and collided with my CIDR range.”
“I hit a VGW limit on entries. That led to a BGP crash. And THAT brought down the entire cloud network.”
“Internet-bound packets from the production VPC are getting dropped.”
“I can’t get any friggin’ trace logs out of VGW!”
“A partner says that IPsec connectivity keeps going up and down.”
Understanding VPC Networking Elements … and Limitations
Layers: IGW | NAT SERVICE/GATEWAY | ROUTING TABLES (PCX/BGP/VGW) | NETWORK ACLs | SECURITY POLICIES | EC2
• All layers must work correctly for the network to work
• Proving the network is not the problem requires proving each layer is not the problem
• Network issues can be at any layer, but there is no easy way to tell, making root cause analysis difficult
• Number of layers involved depends upon the destination (example: EC2 to EC2 vs. EC2 to Internet)
• Each layer has its own scale limitation
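To make the "prove each layer" idea concrete, here is a minimal sketch that lists the layers to verify for a given path and reports the first one not yet proven healthy. The layer names follow the slide; the path-to-layer mapping and the helper names (`layers_to_check`, `first_failing_layer`) are assumptions for illustration, not an AWS or Aviatrix API.

```python
from __future__ import annotations

# Hypothetical mapping of traffic paths to the VPC layers a packet crosses.
# Layer names follow the slide; the groupings are illustrative assumptions.
LAYERS_BY_PATH = {
    "ec2-to-ec2": ["security group", "network ACL", "route table"],
    "ec2-to-internet": [
        "security group",
        "network ACL",
        "route table",
        "NAT gateway",
        "internet gateway",
    ],
}


def layers_to_check(path: str) -> list[str]:
    """Return the ordered list of layers to verify for a given path."""
    if path not in LAYERS_BY_PATH:
        raise ValueError(f"unknown path: {path!r}")
    return list(LAYERS_BY_PATH[path])


def first_failing_layer(path: str, healthy: dict[str, bool]) -> str | None:
    """Walk the layers in order; return the first one not proven healthy.

    A layer missing from `healthy` counts as unproven, which mirrors the
    slide's point that every layer has to be cleared explicitly.
    """
    for layer in layers_to_check(path):
        if not healthy.get(layer, False):
            return layer
    return None
```

The EC2 to Internet path carries more layers than EC2 to EC2, so there is more to clear before the network can be ruled out.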
Troubleshooting | Common Connectivity Scenarios
1. EC2 to EC2
2. EC2 to Internet
3. VPC to On-Prem
4. VPC to VNet (multicloud)
1. EC2 to EC2 – Network Troubleshooting
What can go wrong?
• Security Group Policies – for example, ports are not open
• Network ACLs – for example, the inbound port is open but the outbound port is not (network ACLs are not stateful)
• Route Table – for example, human error and the limit on the number of entries
What Does AWS Provide Natively for Troubleshooting?
• Flow Log (minimal information)
• AWS X-Ray
What’s Missing?
• Tools to gather and compare the attributes of both EC2 instances (security groups, network ACLs, and route table entries) side by side
• Guardrails – validation prior to making updates to route tables
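The network ACL gotcha above (inbound open, outbound closed) can be sketched as a small model. It assumes a simplified rule shape (`AclRule`, hypothetical) that ignores protocol and CIDR, and it evaluates rules the way network ACLs are commonly described: lowest rule number first, first match wins, implicit deny at the end. It is not an AWS API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    """Simplified, hypothetical network ACL rule (protocol and CIDR omitted)."""
    number: int
    port_from: int
    port_to: int
    allow: bool


def acl_allows(rules: list[AclRule], port: int) -> bool:
    """Evaluate rules in ascending rule-number order; first match wins."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False  # nothing matched: implicit deny


def tcp_flow_allowed(
    inbound: list[AclRule],
    outbound: list[AclRule],
    dest_port: int,
    client_ephemeral_port: int,
) -> bool:
    """Stateless check from the destination subnet's point of view.

    The request must pass the inbound rules on dest_port AND the reply
    must pass the outbound rules on the client's ephemeral port.
    """
    return acl_allows(inbound, dest_port) and acl_allows(outbound, client_ephemeral_port)


# SSH is allowed in, but no outbound rule lets the reply leave.
ssh_in = [AclRule(100, 22, 22, True)]
replies_out = [AclRule(100, 1024, 65535, True)]
blocked = tcp_flow_allowed(ssh_in, [], 22, 50000)          # False
working = tcp_flow_allowed(ssh_in, replies_out, 22, 50000)  # True
```

A security group would not need the outbound rule for the reply, which is why the same symptom can come from two different layers.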
2. EC2 to Internet – Network Troubleshooting
What can go wrong?
• Unable to see what URLs should be allowed & denied
• All Internet-bound egress traffic is getting blocked
• Security policy (EC2 level/NAT Gateway) exceeds max limit of 200
• My proxy cannot filter non-HTTP/S traffic (e.g. SFTP)
What Does AWS Provide Natively for Troubleshooting?
• Flow Log (minimal information)
What’s Missing?
• Visualization – reporting on allowed/denied URLs
• Alerting on URL access policy violations
• Egress traffic discovery
• Domain-level filtering
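As a rough sketch of egress discovery from flow logs, the snippet below counts rejected flows per destination. It assumes records in the default (version 2) space-separated flow log format; the sample records, account ID, and interface ID are made up for illustration. Note that the records carry IP addresses and ports only, not URLs or domain names, which is one reason the slide calls the information minimal.

```python
from __future__ import annotations

from collections import Counter

# Field order of the default (version 2) VPC flow log record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_record(line: str) -> dict[str, str] | None:
    """Split one record into named fields; return None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def rejected_destinations(lines: list[str]) -> Counter:
    """Count REJECT records per (destination address, destination port)."""
    counts: Counter = Counter()
    for line in lines:
        record = parse_record(line)
        if record is not None and record["action"] == "REJECT":
            counts[(record["dstaddr"], record["dstport"])] += 1
    return counts


# Made-up sample records (documentation address ranges, fake IDs).
SAMPLE_LINES = [
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.10 49152 443 6 3 180 1536000000 1536000060 REJECT OK",
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.10 49153 443 6 3 180 1536000060 1536000120 REJECT OK",
    "2 123456789012 eni-0abc 10.0.1.5 198.51.100.7 49154 22 6 9 900 1536000060 1536000120 ACCEPT OK",
    "not a flow log record",
]

if __name__ == "__main__":
    for (addr, port), count in rejected_destinations(SAMPLE_LINES).most_common():
        print(f"{addr}:{port} rejected {count} time(s)")
```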
3. VPC to On-Prem – Network Troubleshooting
(Diagram: VPC connected to On-Premises Data Center over Direct Connect or Internet)
What can go wrong?
• Network connection (IPsec) is down (VGW or on-prem router)
• Direct Connect / Internet goes down
• Mismatched IPsec parameters
• Route table is misconfigured OR unwanted routes propagated by BGP
• Exceeded route table limits
• Poor performance (latency and/or throughput)
What Does AWS Provide Natively for Troubleshooting?
• VGW up/down status and number of routes
What Is Missing?
• VGW is a black box – no trace logs
• No alerts for route table limit
• No error checking for route table entries
• Automation – guardrails for updating route tables; error checks
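The guardrail idea above can be sketched as a pre-check run before accepting propagated routes. The function name `validate_propagated_routes` is hypothetical, and `max_routes` is supplied by the caller because the actual limit depends on the gateway and current AWS quotas. The sketch assumes IPv4 CIDRs with no host bits set.

```python
from __future__ import annotations

import ipaddress


def validate_propagated_routes(
    local_cidr: str,
    existing_routes: list[str],
    proposed_routes: list[str],
    max_routes: int,
) -> list[str]:
    """Return a list of problems; an empty list means the update looks safe.

    Checks two failure modes from the slides: a propagated route that
    collides with the local CIDR range, and a route count over the limit.
    """
    local = ipaddress.ip_network(local_cidr)
    problems: list[str] = []

    for cidr in proposed_routes:
        if ipaddress.ip_network(cidr).overlaps(local):
            problems.append(f"{cidr} overlaps local CIDR {local_cidr}")

    # Count distinct routes after the update would be applied.
    total = len(set(existing_routes) | set(proposed_routes))
    if total > max_routes:
        problems.append(f"route count {total} exceeds limit {max_routes}")

    return problems


# Example with made-up CIDRs and an artificially low limit of 2.
issues = validate_propagated_routes(
    local_cidr="10.0.0.0/16",
    existing_routes=["192.168.0.0/24"],
    proposed_routes=["10.0.1.0/24", "172.16.0.0/12"],
    max_routes=2,
)
```

Running a check like this before the update, and alerting when the count nears the limit, covers two of the gaps listed under "What Is Missing?".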
4. VPC to VNet (Multicloud) – Network Troubleshooting
What can go wrong?
• Route table is misconfigured OR unwanted routes propagated by BGP
• Exceeded route table limits
• Poor performance (latency and/or throughput)
• Azure VNet or AWS VGW goes down (outage or scheduled maintenance)
What Do AWS/Azure Provide Natively for Troubleshooting?
• VGW up/down status and number of routes
What Is Missing?
• No trace logs for cloud provider gateways
• No alerts for route table limit
• No error checking for route table entries
• Automation – guardrails for updating route tables; error checks
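One simple form of error checking for route table entries in the multicloud case is confirming that each side has a route covering the other side's address space, since a missing return route looks like a dead network from the application. The sketch below assumes IPv4 CIDRs and route tables given as plain lists of destination prefixes; the function names are hypothetical.

```python
from __future__ import annotations

import ipaddress


def has_route_to(route_table: list[str], destination_cidr: str) -> bool:
    """True if some route prefix fully covers the destination CIDR."""
    dest = ipaddress.ip_network(destination_cidr)
    return any(dest.subnet_of(ipaddress.ip_network(r)) for r in route_table)


def check_bidirectional(
    vpc_cidr: str,
    vpc_routes: list[str],
    vnet_cidr: str,
    vnet_routes: list[str],
) -> list[str]:
    """Return a list of missing-route problems (empty if both directions exist)."""
    problems: list[str] = []
    if not has_route_to(vpc_routes, vnet_cidr):
        problems.append("VPC has no route covering the VNet CIDR")
    if not has_route_to(vnet_routes, vpc_cidr):
        problems.append("VNet has no route covering the VPC CIDR")
    return problems
```

For example, `check_bidirectional("10.0.0.0/16", ["10.1.0.0/16"], "10.1.0.0/16", [])` reports only the missing VNet-side route.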
A Consolidated View for Troubleshooting all Layers of AWS Networking
Demo: Aviatrix Controller
Coming Soon – Problem Identification and Insights
• Today you have lots of log data … and no insight
• Coming soon: correlated log data, with suggested expert remediation
Next Steps with Aviatrix
• You’ll receive an email with a link to a replay and slides
• Take 5 minutes and start a free 14-day trial: https://www.aviatrix.com/trial
• To view other bootcamps: https://www.aviatrix.com/bootcamps
• Use the Chat widget to talk live with a Solution Architect
Thank You!


  • 1. Network Troubleshooting in the Cloud: Tools, Techniques, and Gotchas
      AWS Bootcamp #8, September 6, 2018
      Sherry Wei, Founder & CTO; Neel Kamal, Head of Field Operations; Frank Cabri, VP Product Marketing
  • 2. Welcome & Agenda
      • Introductions
      • Understanding VPC Networking Elements
      • Common Troubleshooting Scenarios
      • Demo
      • Q & A
      Featured speakers: Sherry Wei, Founder & CTO; Neel Kamal, Head of Field Operations
      © 2017 AVIATRIX SYSTEMS, INC.
  • 3. Check Out More Bootcamps, Available On-Demand: www.aviatrix.com/bootcamps
  • 4. Network Problems Often Appear at the App Layer …
      • “My production app can’t reach the on-prem database. It was working yesterday. Can you fix the network?”
      • “My instance is running but I can’t reach the Internet. Is the network down?”
      • “From my QA instance, I can no longer SSH into production. You need to fix the network fast!”
      • “VPN performance really sucks. Joe moved to Japan and he’s griping that remote access to dev is way too slow.”
  • 5. … and Get Progressively Harder as You Dig Deeper
      • “A customer’s route table propagated to my cloud environment and collided with my CIDR range.”
      • “I hit a VGW limit on entries. That led to a BGP crash. And THAT brought down the entire cloud network.”
      • “Internet-bound packets from the production VPC are getting dropped.”
      • “I can’t get any friggin’ trace logs out of VGW!”
      • “A partner says that IPsec connectivity keeps going up and down.”
  • 6. Understanding VPC Networking Elements (And Limitations…)
      Layers shown in the diagram: IGW; NAT service/gateway; routing tables (PCX/BGP/VGW); network ACLs; security policies; EC2
      • All layers must work correctly for the network to work
      • Proving the network is not the problem requires proving that each layer is not the problem
      • A network issue can sit at any layer, and there is no easy way to tell which one, making root cause analysis difficult
      • The number of layers involved depends on the destination (example: EC2 to EC2 vs. EC2 to Internet)
      • Each layer has its own scale limitation
  • 7. Troubleshooting | Common Connectivity Scenarios
      1. EC2 to EC2
      2. EC2 to Internet
      3. VPC to On-Prem
      4. VPC to VNet (multicloud)
  • 8. 1. EC2 to EC2 – Network Troubleshooting
      What can go wrong?
      • Security group policies: for example, ports are not open
      • Network ACLs: for example, the inbound port is open but the outbound port is not (network ACLs are not stateful)
      • Route table: for example, human error and the limit on the number of entries
      What does AWS provide natively for troubleshooting?
      • Flow Log (minimal information)
      • AWS X-Ray
      What’s missing?
      • Tools to gather the attributes of both EC2 instances (security policies, network ACLs and route table entries) and compare them side by side
      • Guardrails: validation prior to making updates to route tables
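The "inbound port is open, outbound not open" case on the EC2 to EC2 slide can be sketched with a deliberately simplified model of a stateless network ACL. This is an illustration, not an AWS API: the `AclRule` type and both functions are hypothetical names, and the model ignores CIDR ranges and protocols. It assumes only that rules are evaluated in ascending rule-number order, that the first match decides, and that return traffic must be allowed explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AclRule:
    """Hypothetical, simplified network ACL entry (no CIDR or protocol)."""
    number: int
    direction: str  # "inbound" or "outbound"
    port_from: int
    port_to: int
    allow: bool


def acl_allows(rules: List[AclRule], direction: str, port: int) -> bool:
    """Lowest-numbered matching rule decides; no match means deny."""
    candidates = sorted(
        (r for r in rules if r.direction == direction),
        key=lambda r: r.number,
    )
    for rule in candidates:
        if rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False


def diagnose_tcp(rules: List[AclRule], server_port: int, client_port: int) -> List[str]:
    """Check both directions of one TCP flow against the server subnet's ACL.

    Because the ACL is stateless, the reply to the client's ephemeral
    port needs its own outbound allow rule.
    """
    problems = []
    if not acl_allows(rules, "inbound", server_port):
        problems.append(f"inbound port {server_port} blocked")
    if not acl_allows(rules, "outbound", client_port):
        problems.append(f"outbound return traffic to port {client_port} blocked")
    return problems


if __name__ == "__main__":
    # SSH is allowed in, but nothing is allowed back out.
    ssh_only = [AclRule(100, "inbound", 22, 22, True)]
    print(diagnose_tcp(ssh_only, server_port=22, client_port=50000))
```

A security group with the same single inbound rule would pass this flow, because security groups track connection state and allow the reply automatically.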
  • 9. 2. EC2 to Internet – Network Troubleshooting
      What can go wrong?
      • Unable to see which URLs should be allowed and which denied
      • All Internet-bound egress traffic is getting blocked
      • Security policy (EC2 level/NAT Gateway) exceeds the max limit of 200
      • My proxy cannot filter non-HTTP/S traffic (e.g. SFTP)
      What does AWS provide natively for troubleshooting?
      • Flow Log (minimal information)
      What’s missing?
      • Visualization: reporting on allowed/denied URLs
      • Alerting on URL access policy violations
      • Egress traffic discovery
      • Domain-level filtering
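As a rough illustration of what "domain-level filtering" and "reporting on allowed/denied URLs" could mean for egress traffic, the sketch below classifies destination hostnames against an allowlist of wildcard patterns. The pattern list, hostnames, and function names are all hypothetical, and this is not how any particular product implements filtering; it shows only the matching logic, using Python's standard `fnmatch` module.

```python
from fnmatch import fnmatch
from typing import Dict, Iterable, List

# Hypothetical allowlist; "*.example.com" matches subdomains only.
ALLOWED_PATTERNS = ["*.amazonaws.com", "github.com"]


def is_allowed(domain: str, patterns: Iterable[str]) -> bool:
    """Case-insensitive wildcard match of a hostname against the allowlist."""
    name = domain.lower().rstrip(".")
    return any(fnmatch(name, pattern.lower()) for pattern in patterns)


def classify(domains: Iterable[str], patterns: List[str]) -> Dict[str, List[str]]:
    """Split observed egress destinations into allowed and denied lists."""
    report: Dict[str, List[str]] = {"allowed": [], "denied": []}
    for domain in domains:
        bucket = "allowed" if is_allowed(domain, patterns) else "denied"
        report[bucket].append(domain)
    return report


if __name__ == "__main__":
    observed = ["s3.amazonaws.com", "github.com", "unknown.example.net"]
    print(classify(observed, ALLOWED_PATTERNS))
```

The "denied" list is the part that would feed reporting or alerting on policy violations.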
  • 10. 3. VPC to On-Prem – Network Troubleshooting (VPC to on-premises data center over Direct Connect or Internet)
      What can go wrong?
      • Network connection (IPsec) is down (VGW or on-prem router)
      • Direct Connect / Internet goes down
      • Mismatched IPsec parameters
      • Route table is misconfigured OR unwanted routes are propagated by BGP
      • Exceeded route table limits
      • Poor performance (latency and/or throughput)
      What does AWS provide natively for troubleshooting?
      • VGW up/down status and number of routes
      What is missing?
      • VGW is a black box: no trace logs
      • No alerts for route table limit
      • No error checking for route table entries
      • Automation: guardrails for updating route tables; error checks
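"Mismatched IPsec parameters" is one of the few items on the VPC to On-Prem slide that can be checked mechanically once both sides' settings are written down. The sketch below compares two plain dictionaries and reports every key whose values differ. The key names and values are illustrative assumptions, not the output of any AWS or router command.

```python
from typing import Any, Dict, Tuple


def diff_ipsec(cloud: Dict[str, Any], onprem: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {parameter: (cloud_value, onprem_value)} for every mismatch.

    A parameter missing on one side shows up with None for that side.
    """
    mismatches = {}
    for key in sorted(set(cloud) | set(onprem)):
        cloud_value, onprem_value = cloud.get(key), onprem.get(key)
        if cloud_value != onprem_value:
            mismatches[key] = (cloud_value, onprem_value)
    return mismatches


if __name__ == "__main__":
    # Hypothetical settings collected by hand from each end of the tunnel.
    cloud_side = {"ike_version": 2, "encryption": "aes256", "integrity": "sha256", "dh_group": 14}
    onprem_side = {"ike_version": 2, "encryption": "aes256", "integrity": "sha256", "dh_group": 2}
    for name, (ours, theirs) in diff_ipsec(cloud_side, onprem_side).items():
        print(f"{name}: cloud={ours} on-prem={theirs}")
```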
  • 11. 4. VPC to VNet (Multicloud) – Network Troubleshooting
      What can go wrong?
      • Route table is misconfigured OR unwanted routes are propagated by BGP
      • Exceeded route table limits
      • Poor performance (latency and/or throughput)
      • Azure VNet or AWS VGW goes down/maintenance schedule
      What do AWS/Azure provide natively for troubleshooting?
      • VGW up/down status and number of routes
      What is missing?
      • No trace logs for cloud provider gateways
      • No alerts for route table limit
      • No error checking for route table entries
      • Automation: guardrails for updating route tables; error checks
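The missing "guardrails for updating route tables" can be pictured as a validation step that runs before any route is applied. The sketch below is one possible shape for such a check, using only Python's standard `ipaddress` module: it rejects malformed CIDRs, flags proposed routes that overlap the local address space (the "collided with my CIDR range" complaint from slide 5), and warns when the table would exceed a limit. The function name is hypothetical, and `max_routes` is a caller-supplied assumption; the real limit depends on the provider and gateway type.

```python
import ipaddress
from typing import Iterable, List


def validate_routes(
    local_cidrs: Iterable[str],
    current_routes: Iterable[str],
    proposed_routes: Iterable[str],
    max_routes: int,
) -> List[str]:
    """Return a list of human-readable problems; empty means safe to apply."""
    errors = []
    local = [ipaddress.ip_network(c) for c in local_cidrs]
    existing = {ipaddress.ip_network(r) for r in current_routes}
    new_routes = []

    for text in proposed_routes:
        try:
            # strict parsing also rejects CIDRs with host bits set
            net = ipaddress.ip_network(text)
        except ValueError as exc:
            errors.append(f"{text}: invalid CIDR ({exc})")
            continue
        # A default route overlaps everything by definition, so skip it here.
        if net.prefixlen > 0:
            for own in local:
                if net.version == own.version and net.overlaps(own):
                    errors.append(f"{net} overlaps local range {own}")
        if net not in existing and net not in new_routes:
            new_routes.append(net)

    total = len(existing) + len(new_routes)
    if total > max_routes:
        errors.append(f"route count {total} would exceed limit {max_routes}")
    return errors


if __name__ == "__main__":
    problems = validate_routes(
        local_cidrs=["10.0.0.0/16"],
        current_routes=["192.168.0.0/24"],
        proposed_routes=["10.0.1.0/24", "172.16.0.0/12"],
        max_routes=100,  # assumed value for illustration only
    )
    print(problems)
```

Running the same check against routes learned over BGP, rather than only manual updates, would also cover the "unwanted routes propagated by BGP" case.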
  • 12. Demo: Aviatrix Controller
      A Consolidated View for Troubleshooting All Layers of AWS Networking
  • 13. Coming Soon – Problem Identification and Insights
      • Today you have lots of log data … and no insight
      • Coming soon: correlated log data, with suggested expert remediation
  • 14. Next Steps with Aviatrix
      • You’ll receive an email with a link to a replay and the slides
      • Take 5 minutes and start a free 14-day trial: https://www.aviatrix.com/trial
      • To view other bootcamps: https://www.aviatrix.com/bootcamps
      • Use the Chat widget to talk live with a Solution Architect