The Top Outages of 2023: Analysis and Takeaways
Featured speakers
Mike Hicks
Principal Solutions Analyst
© 2024 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
Before We Begin…
• If you have any questions, please type them in the Questions window.
• If you have any audio problems, please chat us for help.
• A recording of this presentation will be sent to you in a few days.
• Interested in more outage analysis and Internet insights? Check out the ThousandEyes
blog and The Internet Report podcast.
Anatomy of an Outage
• Understanding different types of
Internet outages is important to
mitigate their impact.
• Outages can vary in blast radius, be
planned or unplanned, and have
varying MTTR.
• Network outages depend on where
the problem occurs, with transit
network incidents impacting multiple
providers.
• Tracking outages can help teams
identify patterns and prevent
customer service disruptions.
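As a rough illustration of tracking MTTR (taken here as mean time to resolution), the sketch below summarizes the approximate durations listed on this deck's 2023 timeline slide. The function name `summarize` and the choice to report both mean and median are assumptions made for illustration, not part of the presentation.

```python
from statistics import mean, median

# Approximate durations in hours, as listed on the deck's 2023 timeline slide.
outage_hours = {
    "Microsoft 365": 1.5,
    "Microsoft Outlook": 2,
    "Virgin Media": 7,
    "AWS": 2,
    "Slack": 2,
    "Square": 12,
    "Workday + Cloudflare": 36,
}


def summarize(durations):
    """Return the mean and median time to resolution, in hours."""
    values = list(durations.values())
    return {"mean": mean(values), "median": median(values)}


summary = summarize(outage_hours)
print(f"mean: {summary['mean']:.1f} h, median: {summary['median']:.1f} h")
```

The median is reported alongside the mean because a single long incident (here, the ~36-hour one) can pull the mean well above what a typical outage looks like.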
Outage and Degradation Impacts
[Diagram: dependencies shown include BGP, ISP, CDN, DNS, SaaS Apps, Services, APIs, Data Center, Cloud, DDoS Protection, and SSE.]
• RISK AND COMPLIANCE: Is our traffic getting routed out of region?
• SERVICE AVAILABILITY: Which cloud regions are impacted?
• SITUATIONAL AWARENESS: Are regional ISPs spoofing our DNS records?
• SERVICE RECOVERY: Did we successfully cut over to our DDoS mitigation service?
• NETWORK SECURITY: Are SASE routing policies working as we expect?
• CUSTOMER SUPPORT: Is an Internet outage preventing users from reaching our service?
• WORKFORCE PRODUCTIVITY: Will our Salesforce dev updates degrade performance for some global users?
• REVENUE PROTECTION: Is the payment gateway down or just unreachable?
[Figures shown on the slide without surviving labels: $32,000, $120,000, $3,500, 3474.]
2023 Outages by the Numbers: ISP Compared to CSP
• ThousandEyes reported an increase in cloud service provider (CSP) outages in 2023.
• CSP outages are the second most common type of disruption after ISP outages.
• The ratio of CSP outages to ISP outages increased in 2023.
2023 Outages by the Numbers:
U.S.-centric Outages in Relation to Global Outages
• U.S.-centric outages increased to 37% in 2023 from 34% in 2022.
• Smaller, contained outages are becoming more common.
• Localized outages have different impacts and require different responses compared to global outages.
2023 Outages by the Numbers: Application Outages
• The number and frequency of application outages have been on the rise over the past year.
• Application-related disruptions can have a bigger impact than network outages, though they are not as common.
Connections Are Complex
[Diagram: the connection path from people, places, and things (branch office, employees, BYOD and corporate devices, VDI, IoT cameras and sensors) through the edge, access networks, wireless and mobile networks, wireless gateway, core network, peering, DNS, and ISP transit providers to the data center and to cloud and SaaS (cloud providers, datacenter infrastructure, cloud connectivity, direct connect, SaaS onramp).]
Correlate Performance Across Every Layer
[Figure: performance views from each layer, time correlated.]
2023 Outage Timeline
• Microsoft (1/25)
• Outlook (2/7)
• Virgin Media (4/4)
• AWS (6/13)
• Slack (8/2)
• Square (9/8)
• Workday + Cloudflare (11/2)
Legend: Purple = Application Outage, Red = Network Outage, Blue = Infrastructure Outage
Bookmark the Internet Outages Timeline for outage updates throughout the year.
#1 Microsoft 365 (Jan 25, 2023): ~90 minutes. Network issues due to BGP changes
#2 Microsoft Outlook (Feb 7, 2023): ~2 hours. Service unavailable/application errors
#3 Virgin Media (Apr 4, 2023): ~7 hours. Network traffic loss/BGP route withdrawal
#4 AWS (Jun 13, 2023): ~2 hours. Latency, server timeouts, and HTTP errors
#5 Slack (Aug 2, 2023): ~2 hours. Unable to send/receive messages
#6 Square (Sept 8, 2023): ~12 hours. App errors and backend transactions failing
#7 Workday + Cloudflare (Nov 2, 2023): ~36 hours. Application and service outages
Microsoft 365 (1/25/23)
• Microsoft started experiencing service-related issues around
07:05 UTC.
• The disruption was triggered by an external BGP change
by Microsoft that impacted connected service providers.
• Microsoft BGP prefixes were withdrawn completely
but then almost immediately re-advertised.
• Both smaller (/24) prefixes and summary (/12) prefixes were affected.
• This had a cascading impact on global routing tables, causing
significant churn.
• Prefixes were either withdrawn or re-advertised to
transit providers.
• Significant packet loss was seen, as well as
HTTP and DNS timeouts.
• Timeouts were seen in the application “Response,” further
indicating the effect of the network on service availability.
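To make the /24 versus /12 relationship concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefixes are illustrative private-range values chosen for this example, not the prefixes Microsoft announces, and the helper name `covered_by` is hypothetical.

```python
import ipaddress

# Illustrative private-range prefixes; NOT Microsoft's actual announcements.
summary_prefix = ipaddress.ip_network("10.0.0.0/12")
specific_prefix = ipaddress.ip_network("10.1.2.0/24")


def covered_by(specific, summary):
    """True if the more specific prefix falls inside the summary prefix."""
    return specific.subnet_of(summary)


covered = covered_by(specific_prefix, summary_prefix)

# Number of /24 blocks that a single /12 summary spans.
slash24_count = 2 ** (24 - summary_prefix.prefixlen)

print(f"{specific_prefix} inside {summary_prefix}: {covered}")
print(f"/24 blocks per /12 summary: {slash24_count}")
```

Because one summary prefix spans many more-specific blocks, a change that touches both the summaries and the specifics can affect reachability for a very large address range at once.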
Microsoft Outlook (2/7/23)
• Starting around 03:55 UTC,
Outlook became unavailable.
• Network path was working
properly, but ThousandEyes
observed elevated server
response timeouts and slow
page loading.
• Majority of the errors were
HTTP server timeouts,
indicating an application issue.
• Incident was mostly
concentrated in the U.S. and
lasted ~2 hours.
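The reasoning above (clean network path plus server timeouts points to the application) can be sketched as a simple triage rule. This is only an illustration: the `Probe` fields, the 5% loss threshold, and the function name `classify` are assumptions for the example and do not describe how ThousandEyes classifies outages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Probe:
    packet_loss_pct: float       # loss measured along the network path
    http_status: Optional[int]   # None means the HTTP request timed out


def classify(probe, loss_threshold=5.0):
    """Rough triage: blame the network first, then the application."""
    # Hypothetical threshold: heavy path loss suggests a network problem.
    if probe.packet_loss_pct >= loss_threshold:
        return "network"
    # Path looks clean, but the server timed out or returned a 5xx error.
    if probe.http_status is None or probe.http_status >= 500:
        return "application"
    return "healthy"


print(classify(Probe(packet_loss_pct=0.0, http_status=None)))
print(classify(Probe(packet_loss_pct=40.0, http_status=None)))
```

Checking the network condition first matters: a timeout on a lossy path says little about the application, while a timeout on a clean path says a lot.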
Virgin Media (4/4/23)
• From approximately 00:30 to 17:30 UTC,
two outages impacted the reachability of
Virgin Media UK’s network and services.
• The first incident began at approximately
00:30 UTC and appeared to coincide with
a series of BGP route withdrawals.
• Second incident was shorter, but the
networks experienced similar BGP and
reachability issues.
• Outages were overnight and, due to their
repeat nature, could indicate
maintenance issues.
AWS (6/13/23)
• Outage impacted services within US-EAST-1.
• Lasted about two hours; increased latency,
server timeouts, and HTTP server errors
were observed.
• AWS console access was also affected,
making troubleshooting difficult.
• AWS confirmed the issue was due to a
capacity management subsystem failure.
• Organizations leveraging cloud services,
such as those offered by AWS, should be
aware of the relationships in their digital
ecosystem, regardless of whether those
relationships are services or networks.
Slack (8/2/23)
• Application outage that lasted from 4:01
PM to 6 PM (UTC).
• Network paths and accessibility were
unaffected.
• Initially could be seen as HTTP 500
errors and higher-than-normal page load
times.
• During the outage, users were unable to
upload files or share screenshots.
• Root cause: work on a “routine
database cluster migration” that
accidentally reduced database capacity
to the point that it could not support a
regularly scheduled job.
Square (9/8/23)
• Outage lasted over 18 hours.
• Backend issue that prevented
the platform from processing
payment transactions.
• Users reported various
problems, from terminal
connections dropping out, to
payments appearing to
complete but then not showing
up in business accounts.
• ThousandEyes observed
intermittent dropouts and 503
‘service unavailable’ errors.
Workday + Cloudflare (11/2/23)
• Cloudflare and Workday
experienced a major outage
due to multiple infrastructure
provider failures.
• DR resources took 6 hours to
come online and full resolution
took 36 hours.
• Initial cause was a partial mains
power outage at a Flexential
data center in Portland.
• Further generator and grid
failures resulted in a complete
power loss and ungraceful
shutdown.
Takeaways
• Understanding how your application works is important for quickly identifying failures and making
improvements.
• Just because your application is working doesn't mean it's functioning optimally.
• Knowing how all parts of the service work together is crucial for ongoing design and future optimizations.
• Improved visibility and operational optimizations can prevent outages and minimize their impact.
• Tracking different categories of outages and degradations over time can be helpful.
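One way to track outage categories over time is a simple tally per quarter, sketched below with the seven incidents covered in this deck. The category labels follow the deck's descriptions where they are explicit; the labels for AWS and Square are this sketch's own reading and should be treated as assumptions, as should the function name `tally_by_quarter`.

```python
from collections import Counter
from datetime import date

# (date, category) pairs for the incidents covered in this deck.
events = [
    (date(2023, 1, 25), "network"),         # Microsoft 365: BGP change
    (date(2023, 2, 7), "application"),      # Microsoft Outlook
    (date(2023, 4, 4), "network"),          # Virgin Media: BGP withdrawals
    (date(2023, 6, 13), "infrastructure"),  # AWS: category is an assumption
    (date(2023, 8, 2), "application"),      # Slack
    (date(2023, 9, 8), "application"),      # Square: category is an assumption
    (date(2023, 11, 2), "infrastructure"),  # Workday + Cloudflare: power loss
]


def tally_by_quarter(events):
    """Count incidents per (quarter, category) pair."""
    counts = Counter()
    for day, category in events:
        quarter = (day.month - 1) // 3 + 1
        counts[(f"{day.year}-Q{quarter}", category)] += 1
    return counts


counts = tally_by_quarter(events)
for key in sorted(counts):
    print(key, counts[key])
```

Keeping the tally keyed by both period and category makes it easy to see whether a given type of disruption is becoming more or less frequent.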
Next Steps
Blog and Podcast
• Subscribe to our blog to keep up-to-date!
thousandeyes.com/blog/
• Tune in to The Internet Report Podcast.
https://www.thousandeyes.com/the-internet-report/
Learning Resources
• New tutorial videos on our features
thousandeyes.com/resources/?cat=tutorial
• New Getting Started Guides
docs.thousandeyes.com/product-documentation/getting-started
Support Community
• Still have questions? Ask us on the ThousandEyes
Support Community AMA: http://bit.ly/2023Outages
Q&A

SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

The Top Outages of 2023: Analysis and Takeaways

  • 2. Featured speakers: Mike Hicks, Principal Solutions Analyst. © 2024 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
  • 3. Before We Begin… • If you have any questions, please type them in the Questions window. • If you have any audio problems, please chat with us for help. • A recording of this presentation will be sent to you in a few days. • Interested in more outage analysis and Internet insights? Check out the ThousandEyes blog and The Internet Report podcast.
  • 4. Anatomy of an Outage • Understanding different types of Internet outages is important to mitigate their impact. • Outages can vary in blast radius, be planned or unplanned, and have varying MTTR. • Network outages depend on where the problem occurs, with transit network incidents impacting multiple providers. • Tracking outages can help teams identify patterns and prevent customer service disruptions.
  • 5. Outage and Degradation Impacts (diagram) • Components shown: BGP, ISP, CDN, DNS, SaaS apps, services, APIs, data center, cloud, DDoS protection, SSE. • Risk and compliance: Is our traffic getting routed out of region? • Service availability: Which cloud regions are impacted? • Situational awareness: Are regional ISPs spoofing our DNS records? • Service recovery: Did we successfully cut over to our DDoS mitigation service? • Network security: Are SASE routing policies working as we expect? • Customer support: Is an Internet outage preventing users from reaching our service? • Workforce productivity: Will our Salesforce dev updates degrade performance for some global users? • Revenue protection: Is the payment gateway down or just unreachable? • Figures shown: $32,000; $120,000; $3,500; 3474.
  • 6. 2023 Outages by the Numbers: ISP Compared to CSP • ThousandEyes reported an increase in cloud service provider (CSP) outages in 2023. • CSP outages are the second most common type of disruption after ISP outages. • The ratio of CSP outages to ISP outages increased in 2023.
  • 7. 2023 Outages by the Numbers: U.S.-centric Outages in Relation to Global Outages • U.S.-centric outages rose to 37% of global outages in 2023, up from 34% in 2022. • Smaller, contained outages are becoming more common. • Localized outages have different impacts and require different responses compared to global outages.
  • 8. 2023 Outages by the Numbers: Application Outages • The number and frequency of application outages have been on the rise over the past year. • Application-related disruptions can have a bigger impact than network outages, though they are not as common.
  • 9. Connections Are Complex (diagram) • People, places, and things: branch office, employee BYOD, corporate devices, IoT cameras and sensors, VDI. • Edge: BYOD, data center, IoT. • Networks: core network, mobile networks, peering, access networks, wireless network, wireless gateway, DNS. • Cloud and SaaS: cloud providers, data center infrastructure, cloud connectivity, direct connect, ISP transit providers, SaaS on-ramp.
  • 10. Correlate Performance Across Every Layer (figure: time-correlated views of each layer)
  • 11. 2023 Outage Timeline • Microsoft (1/25) • Outlook (2/7) • Virgin Media (4/4) • AWS (6/13) • Slack (8/2) • Square (9/8) • Workday + Cloudflare (11/2) • Legend: purple = application outage, red = network outage, blue = infrastructure outage. • Bookmark the Internet Outages Timeline for outage updates throughout the year.
  • 12. • #1 Microsoft 365 (Jan 25, 2023): ~90 minutes; network issues due to BGP changes. • #2 Microsoft Outlook (Feb 7, 2023): ~2 hours; service unavailable/application errors. • #3 Virgin Media (Apr 4, 2023): ~7 hours; network traffic loss/BGP route withdrawal. • #4 AWS (Jun 13, 2023): ~2 hours; latency, server timeouts, and HTTP errors. • #5 Slack (Aug 2, 2023): ~2 hours; unable to send/receive messages. • #6 Square (Sept 8, 2023): ~12 hours; app errors and backend transactions failing. • #7 Workday + Cloudflare (Nov 2, 2023): ~36 hours; application and service outages.
  • 13. Microsoft 365 (1/25/23) • Microsoft started experiencing service-related issues around 07:05 UTC. • The disruption was triggered by an external BGP change by Microsoft that impacted connected service providers. • Microsoft BGP prefixes were withdrawn completely but then almost immediately re-advertised. • Both smaller (/24) prefixes and summary (/12) prefixes were affected. • Cascading impact on global routing tables, causing significant churn. • Prefixes were either withdrawn or re-advertised to transit providers. • Significant packet loss was seen, as well as HTTP and DNS timeouts. • Timeouts were seen in the application “Response,” further indicating the effect of the network on service availability.
  • 14. Microsoft Outlook (2/7/23) • Starting around 03:55 UTC, Outlook became unavailable. • Network path was working properly, but ThousandEyes observed elevated server response timeouts and slow page loading. • Majority of the errors were HTTP server timeouts, indicating an application issue. • Incident was mostly concentrated in the U.S. and lasted ~2 hours.
  • 15. Virgin Media (4/4/23) • From approximately 00:30 to 17:30 UTC, two outages impacted the reachability of Virgin Media UK’s network and services. • The first incident began at approximately 00:30 UTC and appeared to coincide with a series of BGP route withdrawals. • The second incident was shorter, but the network experienced similar BGP and reachability issues. • The outages began overnight and, given their repeat nature, could indicate maintenance issues.
  • 16. AWS (6/13/23) • Outage impacted services within US-EAST-1. • Lasted about two hours; increased latency, server timeouts, and HTTP server errors were observed. • AWS console access was also affected, making troubleshooting difficult. • AWS confirmed the issue was due to a capacity management subsystem failure. • Organizations leveraging cloud services, such as those offered by AWS, should be aware of the relationships in their digital ecosystem, regardless of whether those relationships are services or networks.
  • 17. Slack (8/2/23) • Application outage that lasted from 16:01 to 18:00 UTC. • Network paths and accessibility were unaffected. • Initially visible as HTTP 500 errors and higher-than-normal page load times. • During the outage, users were unable to upload files or share screenshots. • Root cause: work on a “routine database cluster migration” accidentally reduced database capacity to the point that it could not support a regularly scheduled job.
  • 18. Square (9/8/23) • Outage lasted over 18 hours. • Backend issue that prevented the platform from processing payment transactions. • Users reported various problems, from terminal connections dropping out, to payments appearing to complete but then not showing up in business accounts. • ThousandEyes observed intermittent dropouts and 503 ‘service unavailable’ errors.
  • 19. Workday + Cloudflare (11/2/23) • Cloudflare and Workday experienced a major outage due to multiple infrastructure provider failures. • DR resources took 6 hours to come online and full resolution took 36 hours. • Initial cause was a partial mains power outage at a Flexential data center in Portland. • Further generator and grid failures resulted in a complete power loss and ungraceful shutdown.
  • 20. Takeaways • Understanding how your application works is important for quickly identifying failures and making improvements. • Just because your application is working doesn't mean it's functioning optimally. • Knowing how all parts of the service work together is crucial for ongoing design and future optimizations. • Improved visibility and operational optimizations can prevent outages and minimize their impact. • Tracking different categories of outages and degradations over time can help teams identify patterns.
  • 21. Next Steps • Blog and Podcast: Subscribe to our blog to keep up to date (thousandeyes.com/blog/). Tune in to The Internet Report podcast (https://www.thousandeyes.com/the-internet-report/). • Learning Resources: New tutorial videos on our features (thousandeyes.com/resources/?cat=tutorial). New Getting Started guides (docs.thousandeyes.com/product-documentation/getting-started). • Support Community: Still have questions? Ask us on the ThousandEyes Support Community AMA: http://bit.ly/2023Outages
  • 22. Q&A
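
Several of the incidents above were categorized by comparing network-layer signals with application-layer signals: Outlook and Slack showed HTTP errors over a healthy network path, while Microsoft 365 showed packet loss alongside the timeouts. A minimal Python sketch of that triage logic follows. The `Sample` type, the `classify` function, and both thresholds are hypothetical names and values chosen for illustration; they are not ThousandEyes metrics or APIs.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One measurement of a service from a vantage point (hypothetical shape)."""
    packet_loss_pct: float  # loss on the network path, 0-100
    http_error_rate: float  # fraction of requests that timed out or returned 5xx, 0-1


# Hypothetical thresholds, picked only to make the example concrete.
LOSS_THRESHOLD_PCT = 5.0
ERROR_THRESHOLD = 0.10


def classify(sample: Sample) -> str:
    """Label a sample as 'network', 'application', or 'healthy'.

    Network loss is checked first: when the path is dropping packets,
    application timeouts are treated as a symptom rather than a cause.
    """
    if sample.packet_loss_pct >= LOSS_THRESHOLD_PCT:
        return "network"
    if sample.http_error_rate >= ERROR_THRESHOLD:
        return "application"
    return "healthy"


if __name__ == "__main__":
    # Packet loss plus timeouts: treated as a network issue.
    print(classify(Sample(packet_loss_pct=40.0, http_error_rate=0.6)))
    # Clean path but many HTTP errors: treated as an application issue.
    print(classify(Sample(packet_loss_pct=0.0, http_error_rate=0.5)))
```

Checking the network layer first is a design choice, not a rule from the deck: it avoids blaming the application for timeouts that packet loss alone could explain.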