SlideShare a Scribd company logo
1 of 11
Availability-Performance testing
to any website all over the world
JOHN POURDANIS
Why select cloud services for testing?
Why Application Insights?
Microsoft Azure regions
Does performance matter?
Performance/Load Testing
Performance testing is usually done to help identify
bottlenecks in a system or to establish a baseline for future
testing. It is also done to ensure compliance with performance
goals and requirements, and to help stakeholders make
informed decisions related to the overall quality of the
application being tested.
Speed - Determines whether the application responds quickly
Scalability - Determines maximum user load the software application can handle.
Stability - Determines if the application is stable under varying loads
What does it check?
Applications in the modern world rely and depend on multiple servers,
internal and external network communications, database services,
operational processes, and a host of other infrastructure services that
must all work uniformly and together. Here, the business ideal is a
continuous flow of information — and any interruption costs the
company money; hence, creating high-availability applications becomes
an important business strategy.
As a general idea, availability is a measure of how often the
application is available for use. More specifically, availability is a
percentage calculation based on how often the application is actually
available to handle service requests when compared to the total,
planned, available runtime. The formal calculation of availability
includes repair time because an application that is being repaired is
not available for use.
Availability Testing
How we count Availability of a website
• Mean Time Between Failure :
Average length of time the
application runs before failing.
• Mean Time To Recovery :
Average length of time needed to
repair and restore service after a
failure.
Parameter Acronym Calculation
Mean Time
Between
Failure
MTBF
Hours /
Failure
Count
Mean Time
To Recovery
MTTR
Repair
Hours /
Failure
Count
Availability = (MTBF / (MTBF + MTTR)) X 100
One popular way to describe availability is by the “nines,” such as three
nines for 99.9% availability. However, the implication of measuring by nines
is sometimes misunderstood. You need to do the arithmetic to discover that
three nines (99.9% availability) represents about 8.5 hours of service outage
in a single year. The next level up, four nines (99.99%), represents about 1
hour of service outage in a single year. Five nines (99.999%) represents only
about 5 minutes of outage per year.
The tricky part of “Nines”
For example, consider an application that is intended to run
perpetually. If we assume 1000 continuous hours as a checkpoint, two
1-hour failures during that time would result in availability of ((1000/2)
/ ((1000/2) + 1)) X 100 = (500 / 501) X 100 = .998 X 100 = 99.8%.
Currently, Netflix uses a service called “Chaos Monkey” to simulate service
failure. Basically, Chaos Monkey is a service that kills other services. We
run this service because we want engineering teams to be used to a
constant level of failure in the cloud. Services should automatically
recover without any manual intervention. We don’t however, simulate
what happens when an entire AZ goes down and therefore we haven’t
engineered our systems to automatically deal with those sorts of failures.
Internally we are having discussions about doing that and people are
already starting to call this service “Chaos Gorilla”.
The example of Netflix
Reference links
https://docs.microsoft.com/en-us/azure/application-insights/app-
insights-monitor-web-app-availability
https://www.qualitestgroup.com/white-papers/introduction-availability-
testing/
https://medium.com/netflix-techblog/lessons-netflix-learned-from-the-
aws-outage-deefe5fd0c04
Thank you

More Related Content

Similar to Availability performance testing with Application Insights.

Leveraging Cloud for Product Testing- Impetus White Paper
Leveraging Cloud for Product Testing- Impetus White PaperLeveraging Cloud for Product Testing- Impetus White Paper
Leveraging Cloud for Product Testing- Impetus White PaperImpetus Technologies
 
White paper on testing in cloud
White paper on testing in cloudWhite paper on testing in cloud
White paper on testing in cloudimkulu
 
PerformanceTestingWithLoadrunner
PerformanceTestingWithLoadrunnerPerformanceTestingWithLoadrunner
PerformanceTestingWithLoadrunnertechgajanan
 
Performance Testing With Loadrunner
Performance Testing With LoadrunnerPerformance Testing With Loadrunner
Performance Testing With Loadrunnervladimir zaremba
 
Design patterns and plan for developing high available azure applications
Design patterns and plan for developing high available azure applicationsDesign patterns and plan for developing high available azure applications
Design patterns and plan for developing high available azure applicationsHimanshu Sahu
 
On designing and deploying internet scale services
On designing and deploying internet scale servicesOn designing and deploying internet scale services
On designing and deploying internet scale servicesbillowqiu
 
An Introduction to Designing Reliable Cloud Services January 2014
An Introduction to Designing Reliable Cloud Services January 2014An Introduction to Designing Reliable Cloud Services January 2014
An Introduction to Designing Reliable Cloud Services January 2014David J Rosenthal
 
Top-Down Network DesignAnalyzing Technical Goals.docx
Top-Down Network DesignAnalyzing Technical Goals.docxTop-Down Network DesignAnalyzing Technical Goals.docx
Top-Down Network DesignAnalyzing Technical Goals.docxjuliennehar
 
Disaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an EmergencyDisaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an Emergencysco813f8ko
 
Tef con2016 (1)
Tef con2016 (1)Tef con2016 (1)
Tef con2016 (1)ggarber
 
7_Questions_DR_Plan_6-23-16
7_Questions_DR_Plan_6-23-167_Questions_DR_Plan_6-23-16
7_Questions_DR_Plan_6-23-16Peak 10
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdfmykovalenko1
 
Performance Testing using LoadRunner
Performance Testing using LoadRunnerPerformance Testing using LoadRunner
Performance Testing using LoadRunnerKumar Gupta
 
Victor Chang: Cloud computing business framework
Victor Chang: Cloud computing business frameworkVictor Chang: Cloud computing business framework
Victor Chang: Cloud computing business frameworkCBOD ANR project U-PSUD
 
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...ijafrc
 
Manage the Digital Transformation with Machine Learning in a Reactive Microse...
Manage the Digital Transformation with Machine Learning in a Reactive Microse...Manage the Digital Transformation with Machine Learning in a Reactive Microse...
Manage the Digital Transformation with Machine Learning in a Reactive Microse...DataWorks Summit
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...ijgca
 

Similar to Availability performance testing with Application Insights. (20)

Leveraging Cloud for Product Testing- Impetus White Paper
Leveraging Cloud for Product Testing- Impetus White PaperLeveraging Cloud for Product Testing- Impetus White Paper
Leveraging Cloud for Product Testing- Impetus White Paper
 
CLOUD TESTING MODEL – BENEFITS, LIMITATIONS AND CHALLENGES
CLOUD TESTING MODEL – BENEFITS, LIMITATIONS AND CHALLENGESCLOUD TESTING MODEL – BENEFITS, LIMITATIONS AND CHALLENGES
CLOUD TESTING MODEL – BENEFITS, LIMITATIONS AND CHALLENGES
 
White paper on testing in cloud
White paper on testing in cloudWhite paper on testing in cloud
White paper on testing in cloud
 
PerformanceTestingWithLoadrunner
PerformanceTestingWithLoadrunnerPerformanceTestingWithLoadrunner
PerformanceTestingWithLoadrunner
 
Performance Testing With Loadrunner
Performance Testing With LoadrunnerPerformance Testing With Loadrunner
Performance Testing With Loadrunner
 
Design patterns and plan for developing high available azure applications
Design patterns and plan for developing high available azure applicationsDesign patterns and plan for developing high available azure applications
Design patterns and plan for developing high available azure applications
 
On designing and deploying internet scale services
On designing and deploying internet scale servicesOn designing and deploying internet scale services
On designing and deploying internet scale services
 
An Introduction to Designing Reliable Cloud Services January 2014
An Introduction to Designing Reliable Cloud Services January 2014An Introduction to Designing Reliable Cloud Services January 2014
An Introduction to Designing Reliable Cloud Services January 2014
 
Top-Down Network DesignAnalyzing Technical Goals.docx
Top-Down Network DesignAnalyzing Technical Goals.docxTop-Down Network DesignAnalyzing Technical Goals.docx
Top-Down Network DesignAnalyzing Technical Goals.docx
 
Disaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an EmergencyDisaster Recovery: Develop Efficient Critique for an Emergency
Disaster Recovery: Develop Efficient Critique for an Emergency
 
Tef con2016 (1)
Tef con2016 (1)Tef con2016 (1)
Tef con2016 (1)
 
7_Questions_DR_Plan_6-23-16
7_Questions_DR_Plan_6-23-167_Questions_DR_Plan_6-23-16
7_Questions_DR_Plan_6-23-16
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdf
 
Performance Testing using LoadRunner
Performance Testing using LoadRunnerPerformance Testing using LoadRunner
Performance Testing using LoadRunner
 
Victor Chang: Cloud computing business framework
Victor Chang: Cloud computing business frameworkVictor Chang: Cloud computing business framework
Victor Chang: Cloud computing business framework
 
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...
An Analysis Of Cloud ReliabilityApproaches Based on Cloud Components And Reli...
 
Manage the Digital Transformation with Machine Learning in a Reactive Microse...
Manage the Digital Transformation with Machine Learning in a Reactive Microse...Manage the Digital Transformation with Machine Learning in a Reactive Microse...
Manage the Digital Transformation with Machine Learning in a Reactive Microse...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
SERVICE LEVEL AGREEMENT BASED FAULT TOLERANT WORKLOAD SCHEDULING IN CLOUD COM...
 

Recently uploaded

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 

Recently uploaded (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 

Availability performance testing with Application Insights.

  • 1. Availability-Performance testing to any website all over the world JOHN POURDANIS
  • 2. Why select cloud services for testing?
  • 6. Performance/Load Testing Performance testing is usually done to help identify bottlenecks in a system or to establish a baseline for future testing. It is also done to ensure compliance with performance goals and requirements, and to help stakeholders make informed decisions related to the overall quality of the application being tested. Speed - Determines whether the application responds quickly Scalability - Determines maximum user load the software application can handle. Stability - Determines if the application is stable under varying loads What does it check?
  • 7. Applications in the modern world rely and depend on multiple servers, internal and external network communications, database services, operational processes, and a host of other infrastructure services that must all work uniformly and together. Here, the business ideal is a continuous flow of information — and any interruption costs the company money; hence, creating high-availability applications becomes an important business strategy. As a general idea, availability is a measure of how often the application is available for use. More specifically, availability is a percentage calculation based on how often the application is actually available to handle service requests when compared to the total, planned, available runtime. The formal calculation of availability includes repair time because an application that is being repaired is not available for use. Availability Testing
  • 8. How we count Availability of a website • Mean Time Between Failure : Average length of time the application runs before failing. • Mean Time To Recovery : Average length of time needed to repair and restore service after a failure. Parameter Acronym Calculation Mean Time Between Failure MTBF Hours / Failure Count Mean Time To Recovery MTTR Repair Hours / Failure Count Availability = (MTBF / (MTBF + MTTR)) X 100
  • 9. One popular way to describe availability is by the “nines,” such as three nines for 99.9% availability. However, the implication of measuring by nines is sometimes misunderstood. You need to do the arithmetic to discover that three nines (99.9% availability) represents about 8.5 hours of service outage in a single year. The next level up, four nines (99.99%), represents about 1 hour of service outage in a single year. Five nines (99.999%) represents only about 5 minutes of outage per year. The tricky part of “Nines” For example, consider an application that is intended to run perpetually. If we assume 1000 continuous hours as a checkpoint, two 1-hour failures during that time would result in availability of ((1000/2) / ((1000/2) + 1)) X 100 = (500 / 501) X 100 = .998 X 100 = 99.8%.
  • 10. Currently, Netflix uses a service called “Chaos Monkey” to simulate service failure. Basically, Chaos Monkey is a service that kills other services. We run this service because we want engineering teams to be used to a constant level of failure in the cloud. Services should automatically recover without any manual intervention. We don’t however, simulate what happens when an entire AZ goes down and therefore we haven’t engineered our systems to automatically deal with those sorts of failures. Internally we are having discussions about doing that and people are already starting to call this service “Chaos Gorilla”. The example of Netflix