Embracing Failure
The art of being at the edge
«Failures are a given, and everything will eventually
fail over time»
(Werner Vogels – CTO Amazon)
Change Mindset
Building a reliable application in the cloud is different
from building a reliable application in an enterprise
setting.
A new mindset is needed.
Eight Fallacies of Distributed Computing
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn’t change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
Peter Deutsch
Conway's law
On-premises Application
- Before the Cloud, users were connected
to our applications through the
Company's local network;
- A server's downtime was planned and
involved stopping production
- Conway’s law model
Modern Application
- Now our users connect through the
Internet
- The workload to which our services are
subjected will increase significantly,
thanks to the greater spread of the
applications themselves.
- Many microservices replace the monolith
Microservices: is it really a matter of size?
We cannot say there is a formal definition of the
microservices architectural style, but we can attempt to
describe what we see as common characteristics for
architectures that fit the label.
Common Characteristics
Componentisation via services
Organised around business capabilities
Decentralised data management
Products not projects
Decentralised governance
Smart endpoints and dumb pipes
Evolutionary design
Infrastructure automation
?????????????
(Martin Fowler, James Lewis)
Microservices: or a question of Business?
Or is it a matter of paradigms?
Sync Communication (e.g. HTTP)
VS
Async Communication (e.g. Service Bus)
Reactive Manifesto (16.09.2014)
• (Jonas Bonér, Dave Farley, Roland Kuhn, Martin Thompson)
• The single most important thing is that the system needs to be responsive.
This means that a reactive system needs to remain responsive even when a failure occurs.
Responsive
“The system responds in a timely manner if at all possible. Responsiveness is the cornerstone of usability and utility,
but more than that, responsiveness means that problems may be detected quickly and dealt with effectively.”
https://www.reactivemanifesto.org/it
Availability
Availability          Downtime per year     Categories
95% (1-nine)          18 days 6 hours       Batch processing, Data extraction, Load jobs
99% (2-nines)         3 days 15 hours       Internal Tools, Project Tracking
99.9% (3-nines)       8 hours 45 minutes    Online Commerce
99.99% (4-nines)      52 minutes            Video Delivery, Broadcast systems
99.999% (5-nines)     5 minutes             Telecom Industry (ATM Transactions)
99.9999% (6-nines)    31 seconds            Answering to my loved one
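The downtime column follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the slide does not say which year length it uses, and it appears to round down); the function name `downtime_per_year` is only illustrative:

```python
from datetime import timedelta

# Assumption: a 365-day year. The slide does not state its year length.
YEAR = timedelta(days=365)


def downtime_per_year(availability_percent: float) -> timedelta:
    """Return the yearly downtime allowed by an availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return YEAR * unavailable_fraction


# Print the allowed downtime for each availability level in the table.
for pct in (95, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}% -> {downtime_per_year(pct)}")
```

Each extra nine divides the allowed downtime by ten, which is why the step from 3-nines to 5-nines is so expensive to achieve.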
Availability
The beauty of Math at work!
Component                       Availability          Downtime
X                               99% (2-nines)         3 days 15 hours
Y                               99.99% (4-nines)      52 minutes
X and Y combined (in series)    98.99%                3 days 16 hours 33 minutes

Component                       Availability          Downtime
X                               99% (2-nines)         3 days 15 hours
Two X in parallel               99.99% (4-nines)      52 minutes
Three X in parallel             99.9999% (6-nines)    31 seconds
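The math behind these two tables can be sketched in a few lines, assuming that component failures are independent (an assumption the slides do not state). Components in series multiply their availabilities; redundant components in parallel multiply their unavailabilities. The function names are illustrative.

```python
from math import prod


def in_series(*availabilities: float) -> float:
    """All components must be up: availabilities multiply."""
    return prod(availabilities)


def in_parallel(*availabilities: float) -> float:
    """At least one component must be up: unavailabilities multiply."""
    return 1.0 - prod(1.0 - a for a in availabilities)


x, y = 0.99, 0.9999
print(f"X and Y in series:   {in_series(x, y):.4%}")
print(f"Two X in parallel:   {in_parallel(x, x):.4%}")
print(f"Three X in parallel: {in_parallel(x, x, x):.4%}")
```

A chain of dependencies is always less available than its weakest link, while redundancy adds nines quickly, as long as the replicas do not fail together.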
Reactive Manifesto - Resilient
Resilient
• Resilient systems embrace the idea that failures are normal and that it
is perfectly acceptable to run systems in what we call partially failing
mode.
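One common way to run in partially failing mode is to degrade gracefully: when a dependency fails, serve a simpler answer instead of failing the whole request. A minimal sketch; `recommendations_service` and `static_bestsellers` are hypothetical names invented for the example, not part of any real API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Try the primary dependency; degrade to the fallback if it fails."""
    try:
        return primary()
    except Exception:
        # The dependency failed: return a degraded answer rather than an error.
        return fallback()


def recommendations_service() -> list[str]:
    # Hypothetical dependency that is currently down.
    raise ConnectionError("recommendations unavailable")


def static_bestsellers() -> list[str]:
    # Hypothetical degraded answer that needs no remote call.
    return ["bestseller-1", "bestseller-2"]


page = call_with_fallback(recommendations_service, static_bestsellers)
print(page)
```

The page still renders, only with less personalised content. A production version would usually add a timeout and a circuit breaker around the primary call.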
Services resiliency
All Azure management services are architected to be resilient
from region-level failures. In the spectrum of failures, one or
more Availability Zone failures within a region have a smaller
failure radius compared to an entire region failure. Azure can
recover from a zone-level failure of management services
within the region or from another Azure region. Azure
performs critical maintenance one zone at a time within a
region, to prevent any failures impacting customer resources
deployed across Availability Zones within a region.
Azure solution
• Availability Zones
• Zonal services: you pin the resource to a specific zone (for example,
virtual machines, managed disks, Standard IP addresses)
• Zone-redundant services: platform replicates automatically across
zones (for example, zone-redundant storage, SQL Database)
What are Availability Zones in Azure?
Reactive Manifesto - Elastic
Elastic
The degree to which a system is able to
adapt to workload changes by provisioning
and de-provisioning resources in an
autonomic manner, such that at each
point in time the available resources
match the current demand as closely as
possible.
• In the Free and Shared service plans, you cannot scale the
application, as only one instance is available.
• In the Basic plan, you can scale the application manually. This
means you have to check the metrics yourself to see if
more instances are needed, and then increase or
decrease them from the Azure management portal.
• In the Standard and Premium plans, you can choose to
autoscale based on a few parameters.
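The idea of matching resources to demand can be illustrated with a generic proportional rule. This is only a sketch under stated assumptions: it is not the Azure autoscale rule engine, and the function name, the 60% CPU target and the instance limits are invented for the example.

```python
import math


def desired_instances(
    current: int,
    avg_cpu_percent: float,
    target_cpu_percent: float = 60.0,  # assumed target, not an Azure default
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Size the pool so that average CPU moves towards the target."""
    wanted = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    # Clamp to the limits of the plan so we never scale to zero or without bound.
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(current=2, avg_cpu_percent=90))  # scale out
print(desired_instances(current=4, avg_cpu_percent=15))  # scale in
```

The upper limit matters as much as the rule itself: without it, a traffic spike or a bug in the metric can provision resources without bound.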
Azure solution
• The code that we use for scripting (PowerShell or bash) …
is code. So we have to treat it as such.
Reactive Manifesto – Message Driven
Guaranteeing Delivery
- The Two Generals Problem
- When we have an unreliable network, which we always do, we cannot guarantee message receipt.
- Instead we must be satisfied with either
- At Most Once
- At Least Once
- Exactly Once
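In practice, an exactly-once effect is usually approximated by combining at-least-once delivery with an idempotent receiver: the sender retries until it sees an acknowledgement, and the receiver ignores message IDs it has already processed. A minimal in-memory sketch; the class and the message IDs are hypothetical, and a real receiver would persist the seen IDs.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if it is delivered many times."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.balance = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Process the message and return True as the acknowledgement."""
        if message_id not in self.seen_ids:
            self.seen_ids.add(message_id)
            self.balance += amount
        # Duplicates are acknowledged too, so the sender stops retrying.
        return True


consumer = IdempotentConsumer()
consumer.handle("order-1", 100)
# The first acknowledgement was lost, so the sender delivers the message again.
consumer.handle("order-1", 100)
consumer.handle("order-2", 50)
print(consumer.balance)
```

The duplicate delivery of `order-1` does not change the balance, so the retry is safe.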
• Event Grid
• Event Hubs
• Service Bus
Azure solution
SERVICE        PURPOSE                            TYPE                             WHEN TO USE
Event Grid     Reactive programming               Event distribution (discrete)    React to status changes
Event Hubs     Big data pipeline                  Event streaming (series)         Telemetry and distributed data streaming
Service Bus    High-value enterprise messaging    Message                          Order processing and financial transactions
Chaos Engineering
Before starting your journey into chaos engineering, make sure you’ve done your homework and have built resiliency
into every level of your organization. Building resilient systems isn’t all about software. It starts at the infrastructure
layer, progresses to the network and data, influences application design and extends to people and culture.
Adrian Hornsby
Chaos Engineering
- Chaos engineering is a technique to meet the resilience requirement.
- Chaos engineering can be used to achieve resilience against
- Infrastructure failures
- Network failures
- Application failures
[Image: the Chaos Monkey logo used by Netflix]
Chaos engineering is the discipline of experimenting on a software system in production in order
to build confidence in the system's capability to withstand turbulent and
unexpected conditions.
Which Chaos Engineering Experiments?
The Phases of Chaos Engineering
It’s important to understand that chaos engineering is NOT about letting monkeys loose or allowing them to break
things randomly without a purpose. Chaos engineering is about breaking things in a controlled environment, through
well-planned experiments in order to build confidence in your application to withstand turbulent conditions.
https://medium.com/@adhorn/chaos-engineering-ab0cc9fbd12a
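A controlled experiment can be sketched as a small loop: check the steady state, inject one fault, check the steady state again, and always roll back. This is a toy illustration, not a real chaos tool; the function names and the two-replica system are assumptions made for the example.

```python
from typing import Callable


def run_experiment(
    steady_state: Callable[[], bool],
    inject_fault: Callable[[], None],
    rollback: Callable[[], None],
) -> str:
    """Run one controlled experiment and report whether the hypothesis held."""
    if not steady_state():
        return "aborted: system not healthy before the experiment"
    inject_fault()
    try:
        held = steady_state()
    finally:
        rollback()  # always undo the fault, even if the check raises
    return "hypothesis held" if held else "weakness found"


# Toy system: the service is healthy while at least one replica is up.
replicas = {"a": True, "b": True}


def healthy() -> bool:
    return any(replicas.values())


def kill_replica_a() -> None:
    replicas["a"] = False


def restore_replica_a() -> None:
    replicas["a"] = True


result = run_experiment(healthy, kill_replica_a, restore_replica_a)
print(result)
```

The initial health check and the unconditional rollback are what separate a planned experiment from breaking things at random.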
Canary Deployment
Canary deployment: Start small, and slowly build confidence within your team and your organization
- How many customers are affected?
- What functionality is impaired?
- Which locations are impacted?
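Starting small means limiting how many customers can be affected. One way to do this, sketched below under stated assumptions, is to hash each user ID into one of 100 buckets and send only the lowest buckets to the canary. The function name and the user IDs are hypothetical; real traffic splitting is usually configured in the load balancer or the deployment platform.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically assign a user to the canary or the stable version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


users = [f"user-{i}" for i in range(10_000)]
for percent in (1, 10, 50):
    share = sum(routes_to_canary(u, percent) for u in users) / len(users)
    print(f"canary at {percent}% -> {share:.1%} of users")
```

Because the assignment is deterministic, a user who lands on the canary stays there while the percentage grows, which keeps the set of affected customers easy to identify.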
New Tools
One of the most efficient methods for uncovering misalignments in software is to put the code together and
run it. Continuous Integration was promoted heavily as part of the XP methodology as a way to achieve this
and is now a common industry norm.
Continuous Delivery builds on the success of CI by automating the steps of preparing code and deploying it
to an environment. CD tools allow engineers to choose a build that passed the CI stage and promote that
through the pipeline to run in production.
Like CI/CD, Continuous Verification is born out of a need to navigate increasingly complex systems. Modern
organizations can’t validate that the internal machinations of the system work as intended, so instead
they verify that the output of the system matches expectations.
Benefits of Chaos Engineering
- Customer: the increased availability and durability of
service means no outages disrupt their day-to-day lives.
- Business: Chaos Engineering can help prevent
extremely large losses in revenue and maintenance
costs, create happier and more engaged engineers, and
improve on-call training for engineering teams
- Technical: the insights from chaos experiments can
mean a reduction in incidents, a reduction in on-call
burden, an increased understanding of system failure
modes, and improved system design
Designed for failure
Common Characteristics
Componentisation via services
Organised around business capabilities
Decentralised data management
Products not projects
Decentralised governance
Smart endpoints and dumb pipes
Evolutionary design
Infrastructure automation
designed for failure
Chaos Engineering is an experiment to ensure that the
impact of failures is mitigated.
Adrian Cockcroft
Tools don’t create reliability.
Humans do.
[But tools can help.]
@CaseyRosenthal
Thank You!!!
Resources
• Reactive Manifesto
• Asynchronous Message-Based-Communication (Microsoft)
• Patterns For Resilient Architecture (Medium)
• The Quest for Availability
• Chaos Engineering
• Availability modes for an Always On availability group
• Configure availability group on Azure SQL Server VM manually
Thanks to
@aacerbis
Linkedin
alberto.acerbis@4solid.it
Software Architect


Embracing Failure - AzureDay Rome

  • 1. Embracing Failure The art of being at the edge
  • 3. Embracing Failure «Failures are a given, and everything will eventually fail over time» (Werner Vogels – CTO Amazon)
  • 5. The art of being at the edge
  • 6. Change Mindset Building a reliable application in the cloud is different than building a reliable application in an enterprise setting A new Mindset is needed.
  • 7. Eight Fallacies of Distributed Computing - The network is reliable - Latency is zero - Bandwidth is infinite - The network is secure - Topology doesn’t change - There is one administrator - Transport cost is zero - The network is homogeneous (Peter Deutsch)
  • 9. On-premises Application - Before the Cloud, users were connected to our applications through the Company's local network; - A server's downtime was planned and involved stopping production - Conway’s law model
  • 10. Modern Application - Now our users connect through the Internet - The workload to which our services are subjected will increase significantly, thanks to the wider spread of the applications themselves. - Many microservices replace the monolith
  • 11. Microservices: is it really a matter of size? We cannot say there is a formal definition of the microservices architectural style, but we can attempt to describe what we see as common characteristics for architectures that fit the label. Common Characteristics: Componentisation via services; Organised around business capabilities; Decentralised data management; Products not projects; Decentralised governance; Smart endpoints and dumb pipes; Evolutionary design; Infrastructure automation; ????????????? (Martin Fowler, James Lewis)
  • 12. Microservices: or a question of Business?
  • 13. Or is it a matter of paradigms? Sync communication (e.g. HTTP) vs. async communication (e.g. Service Bus)
  • 14. Reactive Manifesto (16.01.2014) • (Jonas Bonér, Dave Farley, Roland Kuhn, Martin Thompson) • The absolute, most important thing is that it needs to be responsive. This means that a reactive system needs to remain responsive even when a failure occurs.
  • 15. Responsive “The system responds in a timely manner if at all possible. Responsiveness is the cornerstone of usability and utility, but more than that, responsiveness means that problems may be detected quickly and dealt with effectively.” https://www.reactivemanifesto.org/it
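One common way to stay responsive when a dependency fails or stalls is to bound every call with a timeout and answer from a fallback. The sketch below is only an illustration of that idea. The name `call_with_fallback` and its parameters are hypothetical and do not come from the slides.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_fallback(
    operation: Callable[[], Awaitable[T]],
    timeout_seconds: float,
    fallback: T,
) -> T:
    """Run `operation`, but never wait longer than `timeout_seconds`.

    Hypothetical helper: on a timeout or a connection error the caller
    gets `fallback` (for example a cached value) instead of hanging.
    """
    try:
        return await asyncio.wait_for(operation(), timeout=timeout_seconds)
    except (asyncio.TimeoutError, ConnectionError):
        # The dependency failed or was too slow: degrade instead of blocking.
        return fallback
```

The caller always gets an answer within the time budget, which is what lets problems be detected quickly and handled, rather than propagating as a hang.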
  • 16. Availability
      Availability          Downtime per year     Categories
      95% (1-nine)          18 days 6 hours       Batch processing, data extraction, load jobs
      99% (2-nines)         3 days 15 hours       Internal tools, project tracking
      99.9% (3-nines)       8 hours 45 minutes    Online commerce
      99.99% (4-nines)      52 minutes            Video delivery, broadcast systems
      99.999% (5-nines)     5 minutes             Telecom industry (ATM transactions)
      99.9999% (6-nines)    31 seconds            Answering to my loved one
  • 17. Availability: the beauty of math at work!
      Components in series (both are required)
      Component             Availability          Downtime
      X                     99% (2-nines)         3 days 15 hours
      Y                     99.99% (4-nines)      52 minutes
      X and Y combined      98.99%                3 days 16 hours 33 minutes
      Components in parallel (any one is enough)
      Component             Availability          Downtime
      X                     99% (2-nines)         3 days 15 hours
      Two X in parallel     99.99% (4-nines)      52 minutes
      Three X in parallel   99.9999% (6-nines)    31 seconds
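The arithmetic behind the two availability slides can be sketched in a few lines. The function names are illustrative, and the sketch assumes that component failures are independent and that a year has 365 days. Neither assumption is stated on the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_seconds_per_year(availability: float) -> float:
    """Expected yearly downtime for an availability given as a fraction (0.99 = 99%)."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def in_series(*availabilities: float) -> float:
    """Every component is required, so the availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def in_parallel(*availabilities: float) -> float:
    """One working component is enough, so the unavailabilities multiply."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down
```

For example, `in_series(0.99, 0.9999)` is about 0.9899, lower than either component alone, while `in_parallel(0.99, 0.99)` is about 0.9999. Chaining dependencies lowers availability and adding redundancy raises it.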
  • 18. Reactive Manifesto - Resilient
  • 19. Resilient • Resilient systems embrace the idea that failures are normal and that it is perfectly acceptable to run systems in what we call partially failing mode.
  • 20. Services resiliency (Azure solution: Availability Zones). All Azure management services are architected to be resilient from region-level failures. In the spectrum of failures, one or more Availability Zone failures within a region have a smaller failure radius compared to an entire region failure. Azure can recover from a zone-level failure of management services within the region or from another Azure region. Azure performs critical maintenance one zone at a time within a region, to prevent any failures impacting customer resources deployed across Availability Zones within a region. • Zonal services: you pin the resource to a specific zone (for example, virtual machines, managed disks, Standard IP addresses) • Zone-redundant services: the platform replicates automatically across zones (for example, zone-redundant storage, SQL Database). Source: What are Availability Zones in Azure?
  • 22. Elastic The degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible.
  • 23. Azure solution • In the Free and Shared service plans, you cannot scale the application, because only one instance is available. • In the Basic plan, you can scale the application manually: you check the metrics yourself to see whether more instances are needed, then increase or decrease them from the Azure management portal. • In the Standard and Premium plans, you can autoscale based on a few parameters. • The code that we use for scripting (PowerShell or bash) is still code, so we have to treat it as such.
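To make the elasticity definition concrete, here is a minimal sketch of a proportional scaling rule that tries to keep the instance count matched to demand. It is not Azure's autoscale algorithm. The function name, the 60% CPU target and the instance limits are all assumptions chosen for illustration.

```python
import math


def desired_instances(
    current: int,
    cpu_percent: float,
    target_percent: float = 60.0,  # hypothetical target utilisation
    minimum: int = 1,              # hypothetical lower bound
    maximum: int = 10,             # hypothetical upper bound
) -> int:
    """Return the instance count that would bring average CPU back to the target.

    Scales out when load is above the target and scales in when it is below,
    always staying within [minimum, maximum].
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    raw = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, raw))
```

With these defaults, two instances at 90% CPU become three, and four instances at 15% CPU shrink to one. The upper and lower bounds keep a bad metric from provisioning without limit or removing every instance.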
  • 24. Reactive Manifesto – Message Driven
  • 25. Guaranteeing Delivery - The Two Generals Problem - When we have an unreliable network, which we always do, we cannot guarantee message receipt. - Instead we must be satisfied with one of - At Most Once - At Least Once - Exactly Once
  • 26. Azure solution • Event Grid • Event Hubs • Service Bus
      Service       Purpose                           Type                            When to use
      Event Grid    Reactive programming              Event distribution (discrete)   React to status changes
      Event Hubs    Big data pipeline                 Event streaming (series)        Telemetry and distributed data streaming
      Service Bus   High-value enterprise messaging   Message                         Order processing and financial transactions
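The delivery guarantees above are often combined in practice. The sender retries until it sees an acknowledgement (at least once), and the receiver ignores message IDs it has already processed, so the effect is applied only once. The sketch below simulates that with in-memory objects. The class, function and parameter names are hypothetical, and no real broker API is used.

```python
class IdempotentConsumer:
    """Applies each message ID at most once, even if it is delivered repeatedly."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.balance = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Process a message; return False if it was a duplicate."""
        if message_id in self._seen:
            return False  # already applied: ignore the redelivery
        self._seen.add(message_id)
        self.balance += amount
        return True


def deliver_at_least_once(
    consumer: IdempotentConsumer,
    message_id: str,
    amount: int,
    lost_acks: int,
) -> int:
    """Simulate a sender whose first `lost_acks` acknowledgements are lost.

    The sender cannot tell a lost message from a lost ack, so it resends.
    Returns the number of delivery attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        consumer.handle(message_id, amount)
        if attempts > lost_acks:  # this ack finally reached the sender
            return attempts
```

The sender may deliver the same message several times, but the consumer's state changes once per message ID. A real system would have to persist the set of seen IDs, which this sketch keeps in memory.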
  • 27. Chaos Engineering Before starting your journey into chaos engineering, make sure you’ve done your homework and have built resiliency into every level of your organization. Building resilient systems isn’t all about software. It starts at the infrastructure layer, progresses to the network and data, influences application design and extends to people and culture. Adrian Hornsby
  • 28. Chaos Engineering - Chaos engineering is the discipline of experimenting on a software system in production in order to build confidence in the system's capability to withstand turbulent and unexpected conditions. - Chaos engineering is a technique to meet the resilience requirement. - Chaos engineering can be used to achieve resilience against - Infrastructure failures - Network failures - Application failures (Image: the logo for Chaos Monkey used by Netflix)
  • 29. Which Chaos Engineering Experiments?
  • 30. The Phases of Chaos Engineering It’s important to understand that chaos engineering is NOT about letting monkeys loose or allowing them to break things randomly without a purpose. Chaos engineering is about breaking things in a controlled environment, through well-planned experiments in order to build confidence in your application to withstand turbulent conditions. https://medium.com/@adhorn/chaos-engineering-ab0cc9fbd12a
  • 31. Canary Deployment: Start small, and slowly build confidence within your team and your organization - How many customers are affected? - What functionality is impaired? - Which locations are impacted?
  • 32. New Tools One of the most efficient methods for uncovering misalignments in software is to put the code together and run it. Continuous Integration was promoted heavily as part of the XP methodology as a way to achieve this and is now a common industry norm. Continuous Delivery builds on the success of CI by automating the steps of preparing code and deploying it to an environment. CD tools allow engineers to choose a build that passed the CI stage and promote it through the pipeline to run in production. Like CI/CD, Continuous Verification is born out of a need to navigate increasingly complex systems. Modern organizations can’t validate that the internal machinations of the system work as intended, so instead they verify that the output of the system matches expectations.
  • 33. Benefits of Chaos Engineering - Customer: the increased availability and durability of service means no outages disrupt their day-to-day lives. - Business: Chaos Engineering can help prevent extremely large losses in revenue and maintenance costs, create happier and more engaged engineers, and improve on-call training for engineering teams - Technical: the insights from chaos experiments can mean a reduction in incidents, reduction in on-call burden, increased understanding of system failure modes, improved system design
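"Start small" can be implemented by routing a fixed percentage of users to the new version and raising that percentage as confidence grows. The sketch below shows one possible way to do this with a stable hash of the user ID. The function name and the choice of SHA-256 bucketing are assumptions and are not taken from the slides.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Decide whether a user is served by the canary release.

    Each user is mapped to a stable bucket in [0, 100). Users whose bucket
    is below `percent` get the canary, so a user who is in the canary at 5%
    is still in it at 50%.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket depends only on the user ID, the set of affected customers is known and bounded at each step, which helps answer "how many customers are affected?" before widening the rollout.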
  • 34. Designed for failure Common Characteristics: Componentisation via services; Organised around business capabilities; Decentralised data management; Products not projects; Decentralised governance; Smart endpoints and dumb pipes; Evolutionary design; Infrastructure automation; Designed for failure. Chaos Engineering is an experiment to ensure that the impact of failures is mitigated. (Adrian Cockcroft)
  • 35. Tools don’t create reliability. Humans do. @CaseyRosenthal Thank You!!!
  • 36. Tools don’t create reliability. Humans do. [But tools can help.] @CaseyRosenthal Thank You!!!
  • 37. Resources • Reactive Manifesto • Asynchronous Message-Based Communication (Microsoft) • Patterns For Resilient Architecture (Medium) • The Quest for Availability • Chaos Engineering • Availability modes for an Always On availability group • Configure availability group on Azure SQL Server VM manually