© 1992–2023 Cisco Systems, Inc. All rights reserved.
Featured Speaker
Mike Hicks
Principal Solutions Analyst
Before We Begin...
• If you have any questions, please type them in the Questions window.
• If you have any audio problems, please message us in the chat for help.
• A recording of this presentation will be sent to you in a few days.
@ThousandEyes
Agenda
• About ThousandEyes
• Noteworthy Outages of 2022
• Primer: Digital Service Building Blocks
• Top Ten Outage Countdown
• Lessons & Takeaways
• Q&A
Actionable Insight for Internet, Cloud, and SaaS
• Correlated Insights: Quickly isolate issues to app, network, or service
• Network Visibility: Overlay, hop-by-hop underlay, ISP performance, and BGP routing
• App Experience: SaaS, API, and internal app performance and user experience
2022 Noteworthy Outages
Legend: Major, Significant, Shadow
• British Airways (2/25)
• Twitter prefixes hijacked (3/28)
• Atlassian services unavailable (4/5)
• Rogers routing failure (7/8)
• AWS AZ Failure (7/28)
• Google outage (8/9)
• Zoom Outage (9/15)
• Zscaler Internet Access Failure (10/25)
• WhatsApp Outage (10/25)
• AWS packet loss (12/5)
The Building Blocks of Today’s Digital Services
[Diagram: CDN, Cloud, BGP, DNS, SaaS]
Many Options, Complex Dependencies
[Diagram: Users, ISP, DNS, BGP, CDN, Security, Your App, Cloud APIs, Data Center, Cloud IaaS]
Step 1: DNS – Where Are We Going?
[Diagram: Users, ISP, BGP, CDN, Your App; DNS lookup via Root Server, TLD Server, Authoritative Server]
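As a rough illustration of the lookup chain in Step 1, the sketch below walks a toy delegation table from the root server to the TLD server to the authoritative server. The server names, the tables, and the `resolve` helper are hypothetical; `example.com` and `192.0.2.10` are reserved documentation values. A real resolver sends queries over the network and caches the answers.

```python
# Toy model of iterative DNS resolution: root -> TLD -> authoritative.
# All server names and records below are hypothetical illustration data.

ROOT = {"com.": "tld-com"}  # root refers "com." to a TLD server
TLD = {"tld-com": {"example.com.": "auth-example"}}  # TLD refers the domain onward
AUTH = {"auth-example": {"www.example.com.": "192.0.2.10"}}  # final answer


def resolve(name: str) -> list[str]:
    """Return the steps taken to resolve `name`, ending with the outcome."""
    if not name.endswith("."):
        name += "."
    labels = name.rstrip(".").split(".")
    tld = labels[-1] + "."
    domain = ".".join(labels[-2:]) + "."
    steps = []

    # Step 1: ask the root which server handles the TLD.
    tld_server = ROOT.get(tld)
    if tld_server is None:
        return steps + [f"root: no delegation for {tld}"]
    steps.append(f"root -> {tld_server}")

    # Step 2: ask the TLD server which server is authoritative for the domain.
    auth_server = TLD[tld_server].get(domain)
    if auth_server is None:
        return steps + [f"{tld_server}: no delegation for {domain}"]
    steps.append(f"{tld_server} -> {auth_server}")

    # Step 3: ask the authoritative server for the address record.
    address = AUTH[auth_server].get(name)
    if address is None:
        return steps + [f"{auth_server}: NXDOMAIN for {name}"]
    steps.append(f"{auth_server} -> {address}")
    return steps
```

Calling `resolve("www.example.com")` returns one entry per hop, which makes it easy to see that a failure at any of the three levels stops the user from ever learning where to go.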
Step 2: How Do We Get There?
[Diagram: Users, ISP, BGP, DNS, CDN, Your App]
Step 3: CDNs - Do We Have to Travel So Far?
[Diagram: Users, ISP, BGP, DNS, CDN, Your App]
Step 4: Rinse and Repeat for Services & API Calls
[Diagram: Your App, SaaS Apps, Cloud APIs, Data Center, Backend Services]
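One way to picture Step 4 is as a dependency graph: every service the app calls needs its own DNS lookup, route, and possibly CDN hop, so the set of things that can fail is the full transitive set of dependencies. The sketch below enumerates that set. The node names come from the slide, but the edges in `DEPENDS_ON` and the helper name are assumptions made for illustration.

```python
from collections import deque

# Hypothetical dependency map using the labels from the slide; the edges
# are assumptions for illustration, not taken from the presentation.
DEPENDS_ON = {
    "Your App": ["SaaS Apps", "Cloud APIs", "Backend Services"],
    "Backend Services": ["Data Center", "Cloud APIs"],
    "SaaS Apps": [],
    "Cloud APIs": [],
    "Data Center": [],
}


def transitive_dependencies(service: str) -> list[str]:
    """Breadth-first walk returning every service reachable from `service`."""
    seen = []
    queue = deque(DEPENDS_ON.get(service, []))
    while queue:
        dep = queue.popleft()
        if dep in seen:
            continue  # already recorded via another path
        seen.append(dep)
        queue.extend(DEPENDS_ON.get(dep, []))
    return seen
```

Each name returned by `transitive_dependencies("Your App")` is a candidate for its own monitoring, since the app is only reachable end to end when all of them are.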
Top Ten Countdown
#10 to #6
Rogers, Jul 8, 2022
• ~24 hours
• App + routing issues
• Rogers withdrew its prefixes due to an internal routing issue, rendering it unreachable across the Internet for nearly 24 hours.
Lesson: No provider is immune to outages. Plan for a backup network provider that can limit the length and scope of an outage.
Zscaler Internet Access, Oct 25, 2022
• ~30 minutes
• Network traffic loss
• Customers using Zscaler Internet Access (ZIA) experienced connectivity failures or high latency in reaching Zscaler proxies.
Lesson: Having network-agnostic data for complex scenarios like this can enable quicker attribution and remediation.
AWS, Dec 5, 2022
• ~1 hour
• Network traffic/packet loss
• Significant packet loss between 2 global locations and AWS' us-east-2 region.
Lesson: It’s important to monitor not just the applications, but also the cloud infrastructure components and any dependent cloud software services.
WhatsApp, Oct 25, 2022
• ~2 hours
• Failure to send/receive messages
• The two-hour outage left WhatsApp users unable to send or receive messages.
Lesson: A thriving SaaS business relies on continuous improvement, which is why an immediate feedback loop, whereby mistakes can be rectified quickly, is necessary.
Atlassian, Apr 5, 2022
• ~2.5 days
• Service unavailable/data loss
• Due to a maintenance script error, Atlassian services experienced a days-long outage.
Lesson: One cannot rely on status pages alone to communicate about outages. Customers can be left worrying with no answer as to how serious an outage is and when it will be fixed.
Zoom, September 15th, 2022
#5
• Service unavailable ~20 minutes
• Users were unable to log in or join meetings
• Most of the HTTP errors seen were 502 Bad Gateway responses, indicative of potential CDN issues
• The service would appear to be available if tested only at the IP level, but looking at HTTP results/service status tells a different story
Lesson: It may be that the app itself is causing issues rather than the network. Having visibility into which it is can prevent confusion and finger-pointing during root cause analysis.
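The distinction in this slide, reachable at the network layer but failing at the HTTP layer, can be sketched as a small classification step. The `ProbeResult` type, its field names, and the returned labels are hypothetical; how the two measurements are collected (for example a TCP connect and an HTTP GET) is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    network_reachable: bool      # e.g. a TCP connect to the server succeeded
    http_status: Optional[int]   # None if no HTTP response was received


def classify(result: ProbeResult) -> str:
    """Attribute a failed check to the network or to the application."""
    if not result.network_reachable:
        return "network"
    if result.http_status is None:
        return "application (no HTTP response)"
    if 500 <= result.http_status <= 599:
        return "application (server error)"
    return "ok"
```

A probe that connects but receives a 502, as in the Zoom incident, is classified as an application-side problem, while a network-only check of the same target would have reported it as healthy.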
British Airways, February 25, 2022
#4
• Service unavailable ~20 minutes
• Outage caused hundreds of flight cancellations and disruptions in the airline's operations
• Network paths to the airline’s online services (and servers) were reachable, but server and site responses were timing out
Lesson: Architecting backends that avoid single points of failure can reduce the likelihood of a cascading chain of failures.
Google, August 9, 2022
#3
• Service unavailable for ~60 minutes
• Outage affected Google Search and Maps
• During this time, Google web servers responded with HTTP 500 Internal Server Error messages, 502 Bad Gateway errors, and timeouts
Lesson: It is important to monitor not just your application front ends but also the performance-critical dependencies that power your app.
AWS AZ Failure, July 28th, 2022
#2
• Service unavailable ~20 minutes, ~3 hours for customers to recover
• Caused by an Availability Zone power failure
• Impacted applications such as Webex, Okta, and Splunk
• Affected EC2 instances and EBS volumes as well as traffic routing
Lesson: Be sure to have a redundant multi-AZ architecture. Such designs are typically active/active and remove the need to execute a backup plan.
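To make the active/active idea concrete, the sketch below spreads requests across every Availability Zone that is currently healthy, so losing one zone shrinks the pool instead of triggering a separate failover procedure. The zone names, the health map, and the `route` helper are hypothetical, and real load balancers use health checks and weighting that this toy round-robin leaves out.

```python
from itertools import cycle


def healthy_targets(az_health: dict[str, bool]) -> list[str]:
    """Return the zones currently marked healthy, in a stable order."""
    return sorted(az for az, ok in az_health.items() if ok)


def route(requests: int, az_health: dict[str, bool]) -> list[str]:
    """Assign each request to a healthy zone in round-robin order."""
    targets = healthy_targets(az_health)
    if not targets:
        raise RuntimeError("no healthy Availability Zone available")
    pool = cycle(targets)
    return [next(pool) for _ in range(requests)]
```

With `{"az-a": True, "az-b": False, "az-c": True}` as input, traffic simply alternates between `az-a` and `az-c`; no backup plan has to be executed when `az-b` fails.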
Twitter, March 28th, 2022
#1
• Service unavailable ~45 minutes
• Twitter was rendered unreachable for some users when JSC RTComm.RU (AS 8342) announced one of Twitter’s prefixes and subsequently blackholed traffic
• Since Twitter’s service is not located within RTComm’s network, any Twitter traffic routed to RTComm would have failed
Lesson: Though your company might have RPKI implemented to fend off BGP threats, it's possible that your telco hasn't. This is something to consider when selecting ISPs.
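The RPKI check mentioned in the lesson can be sketched as route origin validation: an announcement is compared against published Route Origin Authorizations (ROAs) and classified as valid, invalid, or not-found. This is a minimal sketch of that logic, not any router's implementation. The `ROA` type and `validate_origin` helper are hypothetical names, `192.0.2.0/24` and AS 64500 are reserved documentation values standing in for the legitimate holder, and only AS 8342 comes from the slide.

```python
import ipaddress
from typing import NamedTuple


class ROA(NamedTuple):
    prefix: str      # prefix the holder has authorized
    max_length: int  # longest prefix length the holder allows
    asn: int         # AS authorized to originate the prefix


def validate_origin(prefix: str, origin_asn: int, roas: list[ROA]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for an announcement."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other IP version or ones that do not cover the prefix.
        if announced.version != roa_net.version or not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched none: the origin is unauthorized.
    return "invalid" if covered else "not-found"
```

Given a ROA for `192.0.2.0/24` held by AS 64500, the same prefix announced by AS 8342 evaluates to `invalid`. A network that drops invalid routes would ignore the announcement, but the protection only applies on paths where the ISPs actually perform this check.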
Lessons and Takeaways
• BGP powers the Internet, but it can also be misused and abused. Visibility and planning are needed to protect your network.
• Public cloud is ubiquitous and reliable, but ensure that you are monitoring all cloud dependencies.
• Avoid single points of failure. Your apps are only as resilient as your architecture.
• Security is essential, but it can add great complexity that requires continuous end-to-end visibility.
• Whenever the infrastructure is touched, failures can occur. Visibility is critical before and after each network change to avoid impacts.
Next Steps
• Subscribe! https://blog.thousandeyes.com
• Get a real-time view of the health of the Internet: https://thousandeyes.com/outages
• Sign up for a Free Trial: https://www.thousandeyes.com/signup
• Request a demo: https://www.thousandeyes.com/request-demo
Copyright ©2023 ThousandEyes
Q&A
The Top Outages of 2022: Analysis and Takeaways

Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

The Top Outages of 2022: Analysis and Takeaways

  • 1. © 1992–2023 Cisco Systems, Inc. All rights reserved.
  • 2. Featured Speaker: Mike Hicks, Principal Solutions Analyst
  • 3. Before We Begin...
      • If you have any questions, please type them in the Questions window.
      • If you have any audio problems, please message us in the chat for help.
      • A recording of this presentation will be sent to you in a few days.
  • 4. Agenda
      • About ThousandEyes
      • Noteworthy Outages of 2022
      • Primer: Digital Service Building Blocks
      • Top Ten Outage Countdown
      • Lessons & Takeaways
      • Q&A
  • 5. Actionable Insight for Internet, Cloud, and SaaS
      • Correlated Insights: Quickly isolate issues to app, network, or service
      • Network Visibility: Overlay, hop-by-hop underlay, ISP performance, and BGP routing
      • App Experience: SaaS, API, and internal app performance and user experience
  • 6. 2022 Noteworthy Outages (legend: Major, Significant, Shadow)
      • British Airways (2/25)
      • Twitter prefixes hijacked (3/28)
      • Atlassian services unavailable (4/5)
      • Rogers routing failure (7/8)
      • AWS AZ Failure (8/9)
      • Zoom Outage (9/15)
      • Zscaler Internet Access Failure (10/25)
      • WhatsApp Outage (10/25)
      • AWS packet loss (12/5)
  • 7. The Building Blocks of Today’s Digital Services (diagram labels: CDN, Cloud, BGP, DNS, SaaS)
  • 8. Many Options, Complex Dependencies (diagram labels: Users, ISP, DNS, BGP, CDN, Security, Your App)
  • 9. Many Options, Complex Dependencies (diagram labels: Users, ISP, DNS, BGP, CDN, Security, Your App, Cloud APIs, Data Center, Cloud IaaS)
  • 10. Step 1: DNS – Where Are We Going? (diagram labels: Users, ISP, BGP, DNS, CDN, Your App; DNS hierarchy: Root Server, TLD Server, Authoritative Server)
  • 11. Step 2: How Do We Get There? (diagram labels: Users, ISP, BGP, DNS, CDN, Your App)
  • 12. Step 3: CDNs – Do We Have to Travel So Far? (diagram labels: Users, ISP, BGP, DNS, CDN, Your App)
  • 13. Step 4: Rinse and Repeat for Services & API Calls (diagram labels: Your App, SaaS Apps, Cloud APIs, Data Center, Backend Services)
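
Step 1 of the primer names three tiers of DNS servers: root, TLD, and authoritative. The sketch below illustrates the order in which a resolver follows those referrals. It uses a toy in-memory hierarchy rather than real network queries; the zone data, server names, and the `resolve` function are hypothetical and not taken from the presentation.

```python
# Toy model of iterative DNS resolution: root -> TLD -> authoritative.
# All zone data below is made up for illustration; no network queries are made.

ROOT = {"com.": "tld-com"}                                    # root refers to a TLD server
TLD_SERVERS = {"tld-com": {"example.com.": "auth-example"}}   # TLD refers to an authoritative server
AUTH_SERVERS = {"auth-example": {"www.example.com.": "192.0.2.10"}}


def resolve(name: str) -> tuple[str, list[str]]:
    """Return (address, list of servers consulted) for a host name."""
    fqdn = name.rstrip(".") + "."          # normalise to a trailing-dot name
    labels = fqdn.rstrip(".").split(".")
    tld = labels[-1] + "."
    zone = ".".join(labels[-2:]) + "."
    path = ["root"]

    # Step 1: the root server refers us to the TLD server.
    tld_server = ROOT.get(tld)
    if tld_server is None:
        raise LookupError(f"no TLD delegation for {tld}")
    path.append(tld_server)

    # Step 2: the TLD server refers us to the zone's authoritative server.
    auth_server = TLD_SERVERS[tld_server].get(zone)
    if auth_server is None:
        raise LookupError(f"no authoritative delegation for {zone}")
    path.append(auth_server)

    # Step 3: the authoritative server answers with the record.
    address = AUTH_SERVERS[auth_server].get(fqdn)
    if address is None:
        raise LookupError(f"no record for {fqdn}")
    return address, path
```

A failure at any tier stops the lookup before the user ever reaches the application, which is why DNS sits first in the dependency chain described in the primer.
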
  • 15. Countdown #10–#6
      • Atlassian, Apr 5, 2022 (~2.5 days; service unavailable/data loss): Due to a maintenance script error, Atlassian services experienced a days-long outage. Lesson: One cannot rely on status pages alone to communicate about outages. Customers can be left worrying with no answer as to how serious an outage is and when it will be fixed.
      • Zscaler Internet Access, Oct 25, 2022 (~30 minutes; network traffic loss): Customers using Zscaler Internet Access (ZIA) experienced connectivity failures or high latency in reaching Zscaler proxies. Lesson: Having network-agnostic data for complex scenarios like this can enable quicker attribution and remediation.
      • WhatsApp, Oct 25, 2022 (~2 hours; failure to send/receive messages): The two-hour outage left WhatsApp users unable to send or receive messages. Lesson: A thriving SaaS business relies on continuous improvement, which is why an immediate feedback loop, whereby mistakes can be rectified quickly, is necessary.
      • AWS, Dec 5, 2022 (~1 hour; network traffic/packet loss): Significant packet loss between two global locations and AWS' us-east-2 region. Lesson: It’s important to monitor not just the applications, but also the cloud infrastructure components and any dependent cloud software services.
      • Rogers, Jul 8, 2022 (~24 hours; app + routing issues): Rogers withdrew its prefixes due to an internal routing issue, rendering it unreachable across the Internet for nearly 24 hours. Lesson: No provider is immune to outages. Plan for a backup network provider that can reduce the length and scope of an outage.
  • 16. #5: Zoom, September 15, 2022
      • Service unavailable for ~20 minutes
      • Users were unable to log in or join meetings
      • Most of the HTTP errors seen were 502 Bad Gateway responses, indicative of potential CDN issues
      • The service would appear to be available if tested only at the IP level, but the HTTP results and service status tell a different story
      Lesson: It may be that the app itself is causing issues rather than the network. Knowing which one is at fault can prevent confusion and finger-pointing during root cause analysis.
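
The Zoom slide contrasts two views of the same incident: the servers answered at the IP level while HTTP requests returned errors. A minimal sketch of how a monitoring check might separate those two layers follows. The `ProbeResult` type, its fields, and the category strings are assumptions made for illustration, not ThousandEyes functionality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    network_reachable: bool      # e.g. a TCP connect to the server IP succeeded
    http_status: Optional[int]   # None if no HTTP response arrived (timeout)


def classify(probe: ProbeResult) -> str:
    """Attribute a failed check to the network layer or the application layer."""
    if not probe.network_reachable:
        return "network"
    if probe.http_status is None:
        return "application (no HTTP response)"
    if 500 <= probe.http_status <= 599:
        return "application (server error)"
    # Reachable and no server-side error: nothing to attribute.
    return "ok"
```

For example, `classify(ProbeResult(network_reachable=True, http_status=500))` is attributed to the application, even though an IP-level test alone would have reported the service as up.
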
  • 17. #4: British Airways, February 25, 2022
      • Service unavailable for ~20 minutes
      • Outage caused hundreds of flight cancellations and disruptions in the airline's operations
      • Network paths to the airline’s online services (and servers) were reachable, but server and site responses were timing out
      Lesson: Architecting backends that avoid single points of failure can reduce the likelihood of a chain of events.
  • 18. #3: Google, August 9, 2022
      • Service unavailable for ~60 minutes
      • Outage affected Google Search and Maps
      • During this time, Google web servers responded with HTTP 500 Internal Server Error messages, 502 Bad Gateway errors, and timeouts
      Lesson: It is important to monitor not just your application front ends but also the performance-critical dependencies that power your app.
  • 19. #2: AWS AZ Failure, July 28, 2022
      • Service unavailable for ~20 minutes; ~3 hours for customers to recover
      • Caused by an Availability Zone power failure
      • Impacted applications such as Webex, Okta, and Splunk
      • Affected EC2 instances and EBS volumes as well as traffic routing
      Lesson: Be sure to have a redundant multi-AZ architecture. Such designs are typically active/active and remove the need to execute a backup plan.
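
The AZ lesson can be expressed as a simple capacity check: after losing any one Availability Zone, are enough instances still running? The sketch below is an illustration under stated assumptions. The function name, the AZ labels, and the `min_healthy` threshold are hypothetical, and no AWS API is called.

```python
from collections import Counter


def survives_az_loss(instance_azs: list[str], min_healthy: int) -> bool:
    """Return True if losing any single AZ still leaves at least min_healthy instances.

    instance_azs holds one AZ label per running instance, e.g. ["az-a", "az-b"].
    """
    if not instance_azs:
        return False
    per_az = Counter(instance_azs)
    total = len(instance_azs)
    # Check the worst case for every AZ that currently hosts instances.
    return all(total - count >= min_healthy for count in per_az.values())
```

With four instances split evenly across two zones and `min_healthy=2`, the check passes; with three of the four in one zone, it fails because losing that zone leaves a single instance.
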
  • 20. #1: Twitter, March 28, 2022
      • Service unavailable for ~45 minutes
      • Twitter was rendered unreachable for some users when JSC RTComm.RU (AS 8342) announced one of Twitter’s prefixes and subsequently blackholed traffic
      • Since Twitter’s service is not located within RTComm’s network, any Twitter traffic routed to RTComm would have failed
      Lesson: Though your company might have RPKI implemented to fend off BGP threats, your telco might not. This is something to consider when selecting ISPs.
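
The RPKI lesson rests on route origin validation: a BGP announcement is compared against signed records (ROAs) that state which AS may originate a prefix. The sketch below follows the valid / invalid / not-found outcomes of RFC 6811 in simplified form. The ROA data uses documentation prefixes and a documentation AS number (64496) as a stand-in for the legitimate owner; only AS 8342 comes from the slide, and none of this is real registry data.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str       # prefix the ROA covers
    max_length: int   # longest prefix length the origin may announce
    asn: int          # AS authorised to originate the prefix


def validate(route_prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if origin_asn == roa.asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none means the origin is not authorised.
    return "invalid" if covered else "not-found"


# Hypothetical ROA: AS 64496 may originate 192.0.2.0/24.
ROAS = [Roa("192.0.2.0/24", 24, 64496)]
```

Here `validate("192.0.2.0/24", 8342, ROAS)` yields "invalid", so a network that enforces origin validation could drop the announcement. A network that does not validate would accept it, which is the gap the lesson points to.
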
  • 21. Lessons and Takeaways
      • BGP powers the Internet, but can also be misused and abused. Visibility and planning are needed to protect your network.
      • Public cloud is ubiquitous and reliable, but ensure that you are monitoring all cloud dependencies.
      • Avoid single points of failure. Your apps are only as resilient as your architecture.
      • Security is essential, but it can add great complexity that requires continuous end-to-end visibility.
      • Whenever the infrastructure is touched, failures can occur. Visibility is critical before and after each network change to avoid impacts.
  • 22. Next Steps (@ThousandEyes)
      • Subscribe: https://blog.thousandeyes.com
      • Get a real-time view of the health of the Internet: https://thousandeyes.com/outages
      • Sign up for a Free Trial: https://www.thousandeyes.com/signup
      • Request a demo: https://www.thousandeyes.com/request-demo
      Copyright ©2023 ThousandEyes
  • 23. Q&A