What are the risks that may affect the availability of a data center
Availability of a data center is the proportion of time it operates without
failure, in other words its uptime. Availability is determined by a system’s
reliability and its recovery time. Because system downtime can have a major
impact on a business, it is necessary to know which factors can affect
data center availability.
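The relationship between reliability and recovery time can be made concrete with the usual steady-state formula, availability = MTBF / (MTBF + MTTR). The following is a minimal sketch, not taken from the original article: the function names and the sample figures are assumptions chosen only for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures (MTBF)
    and mean time to repair (MTTR), both in hours."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures: one failure every 2000 hours, 4 hours to recover.
    a = availability(2000, 4)
    print(f"availability: {a:.5f}")
    print(f"expected downtime: {annual_downtime_hours(a):.1f} hours per year")
```

The formula shows that availability can be improved in two ways: by failing less often (higher MTBF) or by recovering faster (lower MTTR).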
Generally, these factors can be divided into four categories, listed below:
• Nature
• Human
• Utility
• Equipment
Nature
This factor has one of the biggest impacts on the availability of a data center. We
can’t predict nature, which may change at any time and cause a complete
disaster. This includes tornadoes, hurricanes, flooding, earthquakes, etc.
Humans have very little control over natural calamities, so they can have
a major impact on data center availability. Maintaining data access in the event of
a disaster can mean the difference between a company’s success or failure. So let us
look at some incidents that have occurred at various companies and
their data centers.
• Lightning: They say lightning doesn’t strike the same place twice, but in 2015 one of
Google’s European data centers was struck by lightning not once, but four times,
causing errors in 5% of the disks responsible for Google Compute Engine (GCE)
instances. Although the company restored many of the drives, an estimated
0.000001% of data stored in the data center was irrecoverably lost. While that might
not sound like much, try telling that to the customers who were affected by it.
• Hurricanes: According to National Geographic, 2017 was the most expensive
hurricane season in U.S. history, costing roughly $200 billion. With their combination
of high winds, storm surge, and heavy rains, hurricanes are one of the most
dangerous natural disasters data centers must contend with. The sudden flooding
resulting from Hurricane Sandy in 2012 caused extensive data center outages in New
York and New Jersey. These failures were made even worse by the fact that backup
systems were located in the same geographic region and were knocked out by the
same weather event.
• Tornadoes: A devastating 2011 tornado ripped through several hospital buildings in
Joplin, Missouri, one of which was a data center. While none of the data lost was
mission-critical, that was only because most of the information stored there had
been migrated to a new offsite data center just a few weeks earlier. Hospital officials
noted that if the tornado had hit a month earlier, the data loss would have been
catastrophic and rendered the hospital completely inoperable.
• Flooding: Severe flooding in Leeds, UK caused a Vodafone data center to temporarily
lose power during Christmas of 2015. While data loss was negligible, the power
outage disrupted mobile phone service temporarily. Vodafone, of course, has a bit of
history with flooding, having suffered one of the most infamous data center disasters
when its Istanbul data center was devastated by flooding in 2009.
• Earthquakes: So far, data centers have been lucky. Modern architectural standards
and additional precautions (such as special enclosures and rollers for server racks)
have gone a long way towards protecting data centers from earthquakes, even in
high-risk areas.
• The Unexpected…: Disaster planning is all about expecting the unexpected. Take, for
instance, the squirrel that knocked Yahoo’s Santa Clara data center offline for several
hours in 2010, or the truck that drove into a transformer feeding power into a
Rackspace data center in 2007.
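The Hurricane Sandy example above shows why backup systems should not sit in the same region as the primary site. One simple check is the great-circle distance between the two sites. The sketch below uses the haversine formula; the 500 km minimum and the example coordinates are assumptions for illustration, not figures from the article, and distance alone does not guarantee that two sites are outside the same weather or seismic zone.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def is_separated(primary, backup, min_km=500.0):
    """True if the backup site is at least min_km away from the primary.

    min_km is a hypothetical policy value; choose it from your own risk study.
    """
    return haversine_km(*primary, *backup) >= min_km


if __name__ == "__main__":
    # Approximate city coordinates, used only as an example.
    new_york = (40.71, -74.01)
    newark = (40.74, -74.17)
    chicago = (41.88, -87.63)
    print("New York / Newark separated:", is_separated(new_york, newark))
    print("New York / Chicago separated:", is_separated(new_york, chicago))
```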
Human
According to a survey conducted by Aperture Research Institute, human errors are
behind 57.3% of all data center outages. The second most common reason was
improper failover with 43.7%.
Above: Diagram from the Aperture survey.
Another survey gives a similar picture.
According to the Uptime Institute, 70% of data center outages are due to human error and not to a fault in
the infrastructure design. Furthermore, “mistakes” that led to an outage can often be
traced to a poor decision by senior management.
The results from the two organizations differ, possibly because the surveys were
conducted on different entities and in different environments. Taken together, both
surveys suggest that human mistakes cause more data center outages
than any other factor. Some examples of human-caused data center issues:
• Activation of the emergency power-off (EPO) switch
• Adjusting the temperature from Fahrenheit to Celsius
• Pulling power cords out of equipment
• Overloading a circuit
• Not following standard policies or procedures
To minimize the risk of the “human factor” affecting operations, it is important to
have up-to-date documentation on everything connected to your data center and
manuals on how different critical operations should be performed. Manuals and
documentation together with scheduled tests should help you avoid many of the
problems and outages described in this survey.
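Some of the human errors listed above, such as mixing up Fahrenheit and Celsius when adjusting a temperature, can also be guarded against in tooling. The sketch below is one possible approach, assuming a hypothetical setpoint tool: every value must carry an explicit unit and is rejected if it falls outside an allowed range. The class, the function, and the 18 to 27 °C range are illustrative assumptions, not part of any real product.

```python
from dataclasses import dataclass

# Hypothetical allowed range for a supply-air setpoint, in Celsius.
MIN_SETPOINT_C = 18.0
MAX_SETPOINT_C = 27.0


@dataclass(frozen=True)
class Temperature:
    value: float
    unit: str  # "C" or "F"; a bare number without a unit is not accepted

    def to_celsius(self) -> float:
        if self.unit == "C":
            return self.value
        if self.unit == "F":
            return (self.value - 32.0) * 5.0 / 9.0
        raise ValueError(f"unknown unit: {self.unit!r}")


def validate_setpoint(t: Temperature) -> float:
    """Return the setpoint in Celsius, or raise if it is out of range."""
    celsius = t.to_celsius()
    if not MIN_SETPOINT_C <= celsius <= MAX_SETPOINT_C:
        raise ValueError(
            f"setpoint {t.value} {t.unit} ({celsius:.1f} C) is outside "
            f"{MIN_SETPOINT_C}-{MAX_SETPOINT_C} C"
        )
    return celsius


if __name__ == "__main__":
    print(validate_setpoint(Temperature(72, "F")))  # accepted
    try:
        validate_setpoint(Temperature(72, "C"))  # same number, wrong unit
    except ValueError as err:
        print("rejected:", err)
```

A check like this does not replace documentation and procedures; it only catches one class of mistake before it reaches the equipment.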
Utility
In the case of a data center, the primary utility is the electric power
drawn from local providers (which can be government or private
entities). The secondary power sources are the diesel generators and
UPS systems. All other mechanical systems in the data center depend, directly or
indirectly, on the availability of this power.
An Uptime Institute survey finds that the power usage effectiveness (PUE) of data
centers is better than ever. However, the same survey indicates that power outages
have increased significantly. The Global Data Center Survey report from the Uptime
Institute gathered responses from nearly 900 data center operators and IT
practitioners, both from major data center providers and from private,
company-owned data centers.
Even though we provision all equipment with redundancy, there is a chance that
these machines may not work as expected during an incident. One
example: diesel rotary uninterruptible power supply
(DRUPS) systems were implicated in power disruptions that in 2014 affected Amazon
Web Services in Sydney, a former Telecity facility called Sovereign House in London,
now owned by Digital Realty Trust, and the Singapore Stock Exchange. The disruption at
Amazon was caused by what the company called “an unusually long voltage sag.” In
these incidents, the root cause of the outage was a utility failure
followed by backup machines that failed to start. Some commonly
reported failure modes are listed below:
• Generator fails to start
• Generator fails after X hours of running
• Utility power partially fails (usually one of three phases, i.e. phase loss)
• UPS fails to switch to battery
• UPS fails to switch from battery to input power
These incidents show that periodic checks and preventive maintenance
are really important and go a long way toward avoiding the
impact of such failures.
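To see why a generator that fails to start matters even with redundancy, it helps to put rough numbers on it. The sketch below computes the probability that at least the required number of generators start, using a binomial model. It assumes every unit starts independently with the same probability, which is a simplification: a shared fuel supply or a common maintenance lapse can take out several units at once. The start probability used is a made-up figure for illustration.

```python
from math import comb


def prob_enough_start(n_installed: int, n_required: int, p_start: float) -> float:
    """Probability that at least n_required of n_installed generators start.

    Assumes independent starts with identical success probability p_start.
    """
    if not 0.0 <= p_start <= 1.0:
        raise ValueError("p_start must be between 0 and 1")
    if not 0 <= n_required <= n_installed:
        raise ValueError("need 0 <= n_required <= n_installed")
    p_fail = 1.0 - p_start
    return sum(
        comb(n_installed, k) * p_start**k * p_fail ** (n_installed - k)
        for k in range(n_required, n_installed + 1)
    )


if __name__ == "__main__":
    p = 0.95  # hypothetical per-generator start reliability
    print(f"N   (1 of 1): {prob_enough_start(1, 1, p):.5f}")
    print(f"N+1 (1 of 2): {prob_enough_start(2, 1, p):.5f}")
    print(f"N+1 (2 of 3): {prob_enough_start(3, 2, p):.5f}")
```

Preventive maintenance and regular start tests raise the per-unit start probability, which is the input that drives the whole result.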
Equipment
As you know, data center infrastructure is a large collection of
equipment, and success depends on all of it working efficiently together. Any
electrical, mechanical, cooling, networking, or server equipment can
fail unexpectedly. Whether it’s a server reaching the end
of its five-year expected lifespan or a UPS backup battery dying before it should,
equipment failure is one of the most common causes of data center outages.
With today’s powerful data center infrastructure management (DCIM) tools, facilities
can monitor the overall health of their own equipment as well as colocated assets.
While it may not be possible to predict every failure, sophisticated algorithms can
monitor equipment performance continually to anticipate when hardware is
reaching the end of its lifecycle or is prone to break down. When these problems are
identified, data center personnel can plan to switch out faulty or outdated
equipment without having to take critical systems offline. With the
right redundancies, backups, and emergency spares in place, even an unexpected
failure can be managed without compromising network performance.
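As a very simplified illustration of lifecycle monitoring, the sketch below flags assets that have used up most of their expected lifespan. It is an age-based check only and is not how any particular DCIM product works; real tools also look at performance and sensor data. The `Asset` fields, the 80% threshold, and the sample inventory are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date

DAYS_PER_YEAR = 365.25


@dataclass
class Asset:
    name: str
    installed: date
    expected_life_years: float


def life_used(asset: Asset, today: date) -> float:
    """Fraction of the expected lifespan already consumed (can exceed 1.0)."""
    age_days = (today - asset.installed).days
    return age_days / (asset.expected_life_years * DAYS_PER_YEAR)


def due_for_replacement(assets, today: date, threshold: float = 0.8):
    """Names of assets at or beyond the given fraction of expected life."""
    return [a.name for a in assets if life_used(a, today) >= threshold]


if __name__ == "__main__":
    # Hypothetical inventory.
    inventory = [
        Asset("server-01", date(2020, 1, 1), 5),
        Asset("server-02", date(2023, 1, 1), 5),
        Asset("ups-battery-A", date(2021, 6, 1), 4),
    ]
    print(due_for_replacement(inventory, today=date(2024, 6, 1)))
```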
Source : www.vxchnge.com & www.pingdom.com
Have a comment or points to be reviewed? Knowledge is power, and it grows by
sharing. Feel free to comment.

More Related Content

Similar to What are the risks that may affect the availability of a data center

Will You Be Prepared When The Next Disaster Strikes - Whitepaper
Will You Be Prepared When The Next Disaster Strikes - WhitepaperWill You Be Prepared When The Next Disaster Strikes - Whitepaper
Will You Be Prepared When The Next Disaster Strikes - WhitepaperChristian Caracciolo
 
DistribuTECH 2016: OMNETRIC next generation outage management
DistribuTECH 2016: OMNETRIC next generation outage managementDistribuTECH 2016: OMNETRIC next generation outage management
DistribuTECH 2016: OMNETRIC next generation outage managementOMNETRIC
 
Earthlink Business Cloud Disaster Recovery
Earthlink Business Cloud Disaster RecoveryEarthlink Business Cloud Disaster Recovery
Earthlink Business Cloud Disaster RecoveryMike Ricca
 
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar N
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar NCritical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar N
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar Nnull The Open Security Community
 
Null Feb 13
Null Feb 13Null Feb 13
Null Feb 13Sundar N
 
Mastering disaster e book Telehouse
Mastering disaster e book TelehouseMastering disaster e book Telehouse
Mastering disaster e book TelehouseTelehouse
 
E guide weathering the storm at your business
E guide weathering the storm at your businessE guide weathering the storm at your business
E guide weathering the storm at your businessSoma Technology Group
 
Datacenter Infrastructure Security
Datacenter Infrastructure SecurityDatacenter Infrastructure Security
Datacenter Infrastructure SecurityJoseph Halford
 
V mware business trend brief - crash insurance - protect your business with...
V mware   business trend brief - crash insurance - protect your business with...V mware   business trend brief - crash insurance - protect your business with...
V mware business trend brief - crash insurance - protect your business with...VMware_EMEA
 
Business continuity overview
Business continuity overviewBusiness continuity overview
Business continuity overviewRod Davis
 
Wide area protection-and_emergency_control (1)
Wide area protection-and_emergency_control (1)Wide area protection-and_emergency_control (1)
Wide area protection-and_emergency_control (1)Alaa Eladl
 
Kpacket 2014 Top_Ten_Guide
Kpacket 2014 Top_Ten_GuideKpacket 2014 Top_Ten_Guide
Kpacket 2014 Top_Ten_GuideAPEX Global
 
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud Computing
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud ComputingIRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud Computing
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud ComputingIRJET Journal
 
SANOG34-Tutorials-datacentre.pptx
SANOG34-Tutorials-datacentre.pptxSANOG34-Tutorials-datacentre.pptx
SANOG34-Tutorials-datacentre.pptxssuserb3632e
 
European Utility Week 2015: Next Generation Outage Management
European Utility Week 2015: Next Generation Outage ManagementEuropean Utility Week 2015: Next Generation Outage Management
European Utility Week 2015: Next Generation Outage ManagementOMNETRIC
 
Cyber Security for SCADA
Cyber Security for SCADACyber Security for SCADA
Cyber Security for SCADARichard Umbrino
 
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...SolarWinds
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdfmykovalenko1
 

Similar to What are the risks that may affect the availability of a data center (20)

Will You Be Prepared When The Next Disaster Strikes - Whitepaper
Will You Be Prepared When The Next Disaster Strikes - WhitepaperWill You Be Prepared When The Next Disaster Strikes - Whitepaper
Will You Be Prepared When The Next Disaster Strikes - Whitepaper
 
DistribuTECH 2016: OMNETRIC next generation outage management
DistribuTECH 2016: OMNETRIC next generation outage managementDistribuTECH 2016: OMNETRIC next generation outage management
DistribuTECH 2016: OMNETRIC next generation outage management
 
Earthlink Business Cloud Disaster Recovery
Earthlink Business Cloud Disaster RecoveryEarthlink Business Cloud Disaster Recovery
Earthlink Business Cloud Disaster Recovery
 
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar N
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar NCritical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar N
Critical Infrastructure Security Talk At Null Bangalore 13 Feb 2010 Sundar N
 
Null Feb 13
Null Feb 13Null Feb 13
Null Feb 13
 
Mastering disaster e book Telehouse
Mastering disaster e book TelehouseMastering disaster e book Telehouse
Mastering disaster e book Telehouse
 
E guide weathering the storm at your business
E guide weathering the storm at your businessE guide weathering the storm at your business
E guide weathering the storm at your business
 
Datacenter Infrastructure Security
Datacenter Infrastructure SecurityDatacenter Infrastructure Security
Datacenter Infrastructure Security
 
V mware business trend brief - crash insurance - protect your business with...
V mware   business trend brief - crash insurance - protect your business with...V mware   business trend brief - crash insurance - protect your business with...
V mware business trend brief - crash insurance - protect your business with...
 
Business continuity overview
Business continuity overviewBusiness continuity overview
Business continuity overview
 
Wide area protection-and_emergency_control (1)
Wide area protection-and_emergency_control (1)Wide area protection-and_emergency_control (1)
Wide area protection-and_emergency_control (1)
 
Long form final
Long form finalLong form final
Long form final
 
Disaster Recovery
Disaster RecoveryDisaster Recovery
Disaster Recovery
 
Kpacket 2014 Top_Ten_Guide
Kpacket 2014 Top_Ten_GuideKpacket 2014 Top_Ten_Guide
Kpacket 2014 Top_Ten_Guide
 
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud Computing
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud ComputingIRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud Computing
IRJET-Comparative Analysis of Disaster Recovery Solutions in Cloud Computing
 
SANOG34-Tutorials-datacentre.pptx
SANOG34-Tutorials-datacentre.pptxSANOG34-Tutorials-datacentre.pptx
SANOG34-Tutorials-datacentre.pptx
 
European Utility Week 2015: Next Generation Outage Management
European Utility Week 2015: Next Generation Outage ManagementEuropean Utility Week 2015: Next Generation Outage Management
European Utility Week 2015: Next Generation Outage Management
 
Cyber Security for SCADA
Cyber Security for SCADACyber Security for SCADA
Cyber Security for SCADA
 
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...
Federal Webinar: Slow is the New Broke: Improving Government Efficiency with ...
 
Successful_BC_Strategy.pdf
Successful_BC_Strategy.pdfSuccessful_BC_Strategy.pdf
Successful_BC_Strategy.pdf
 

More from Livin Jose

Data center cooling infrastructure slide
Data center cooling infrastructure slideData center cooling infrastructure slide
Data center cooling infrastructure slideLivin Jose
 
Data center power infrastructure
Data center power infrastructureData center power infrastructure
Data center power infrastructureLivin Jose
 
Compliance policies and procedures followed in data centers
Compliance policies and procedures followed in data centersCompliance policies and procedures followed in data centers
Compliance policies and procedures followed in data centersLivin Jose
 
What are cloud service models
What are cloud service modelsWhat are cloud service models
What are cloud service modelsLivin Jose
 
What are the types of cloud computing
What are the types of cloud computingWhat are the types of cloud computing
What are the types of cloud computingLivin Jose
 
Data center power availability provisioning
Data center power availability provisioningData center power availability provisioning
Data center power availability provisioningLivin Jose
 
What is data center availability modes slide
What is data center availability modes slideWhat is data center availability modes slide
What is data center availability modes slideLivin Jose
 
What is a data center
What is a data centerWhat is a data center
What is a data centerLivin Jose
 
What are the types of data centers
What are the types of data centersWhat are the types of data centers
What are the types of data centersLivin Jose
 

More from Livin Jose (9)

Data center cooling infrastructure slide
Data center cooling infrastructure slideData center cooling infrastructure slide
Data center cooling infrastructure slide
 
Data center power infrastructure
Data center power infrastructureData center power infrastructure
Data center power infrastructure
 
Compliance policies and procedures followed in data centers
Compliance policies and procedures followed in data centersCompliance policies and procedures followed in data centers
Compliance policies and procedures followed in data centers
 
What are cloud service models
What are cloud service modelsWhat are cloud service models
What are cloud service models
 
What are the types of cloud computing
What are the types of cloud computingWhat are the types of cloud computing
What are the types of cloud computing
 
Data center power availability provisioning
Data center power availability provisioningData center power availability provisioning
Data center power availability provisioning
 
What is data center availability modes slide
What is data center availability modes slideWhat is data center availability modes slide
What is data center availability modes slide
 
What is a data center
What is a data centerWhat is a data center
What is a data center
 
What are the types of data centers
What are the types of data centersWhat are the types of data centers
What are the types of data centers
 

Recently uploaded

Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxMarkSteadman7
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaWSO2
 

Recently uploaded (20)

Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 

What are the risks that may affect the availability of a data center

  • 1. What are the risks that may affect the availability of a data center Availability of a data center means the maximum uptime that the operation of a data center work without any failure. Availability is determined by a system’s reliability and it’s recovery time. Understanding that the system downtime can cause major impact on business entities, it is necessary to know what are the factors that can impact on data center availability. Generally these factors can be divided into 4 and listed as below, • Nature • Human • Utility • Equipment Nature This factor is having one of the major impact on availability of a data centers. We can’t predict the nature of earth which may change any time and cause to a complete disaster. This will include tornadoes, hurricanes, flooding , earthquakes etc. Control against the natural calamities by humans are really less hence this can have a major impact on availability of data center. Maintaining data access in the event of a disaster can mean the difference between a company’s success or failure. So let us have a look at some of the incidents that were occurred in various companies and their data centers. • Lightning: They say lightning doesn’t strike the same place twice, but in 2015 one of Google’s European data centers was struck by lightning not once, but four times, causing errors in 5% of the disks responsible for Google Compute Engine (GCE) instances. Although the company restored many of the drives, an estimated 0.000001% of data stored in the data center was irrecoverably lost. While that might not sound like much, try telling that to the customers who were affected by it. • Hurricanes: According to National Geographic, 2017 was the most expensive hurricane season in U.S. history, costing roughly $200 billion. With their combination of high winds, storm surge, and heavy rains, hurricanes are one of the most dangerous natural disasters data centers must contend with. The sudden flooding
  • 2. resulting from Hurricane Sandy in 2012 caused extensive data center outages in New York and New Jersey. These failures were made even worse by the fact that backup systems were located in the same geographic region and where knocked out by the same weather event. • Tornadoes: A devastating 2011 tornado ripped through several hospital buildings in Joplin, Missouri, one of which was a data center. While none of the data lost was mission-critical, that was only because most of the information stored there had been migrated to a new offsite data center just a few weeks earlier. Hospital officials noted that if the tornado had hit a month earlier, the data loss would have been catastrophic and rendered the hospital completely inoperable. • Flooding: Severe flooding in Leeds, UK caused a Vodafone data center to temporarily lose power during Christmas of 2015. While data loss was negligible, the power outage disrupted mobile phone service temporarily. Vodafone, of course, has a bit of history with flooding, having suffered one of the most infamous data center disasters when its Istanbul data center was devastated by flooding in 2009. • Earthquakes: So far, data centers have been lucky. Modern architectural standards and additional precautions (such as special enclosures and rollers for server racks) have gone a long way towards protecting data centers from earthquakes, even in high-risk areas. • The Unexpected…: Disaster planning is all about expecting the unexpected. Take, for instance, the squirrel that knocked Yahoo’s Santa Clara data center offline for several hours in 2010, or the truck that drove into a transformer feeding power into a Backspace data center in 2007. Human According to a survey conducted by Aperture Research Institute, human errors are behind 57.3% of all data center outages. The second most common reason was improper failover with 43.7%.
  • 3. Above: Diagram from the Aperture survey.
A second survey, from the Uptime Institute, reports that 70% of data center outages are due to human error and not to a fault in the infrastructure design. Furthermore, “mistakes” that led to an outage can often be traced to a poor decision by senior management. The two organizations’ results may differ because the surveys covered different organizations and different environments. Taken together, both surveys indicate that human mistakes cause more data center outages than any other factor.
Examples of human-caused data center issues include:
• Activating the emergency power-off (EPO) switch
• Switching a temperature setting from Fahrenheit to Celsius
• Pulling power cords out of equipment
• Overloading a circuit
• Not following standard policies or procedures
To minimize the risk of the “human factor” affecting operations, it is important to have up-to-date documentation on everything connected to your data center and manuals on how different critical operations should be performed. Manuals and
  • 4. documentation together with scheduled tests should help you avoid many of the problems and outages described in this survey.
Utility
For a data center, the primary utility is the electric power drawn from a local provider (which can be a government or private entity). The secondary power sources are the diesel generators and UPS systems. All other mechanical systems in the data center depend directly or indirectly on the availability of utility power.
An Uptime Institute survey finds that the power usage effectiveness of data centers is better than ever. However, the same survey indicates that power outages have increased significantly. The Global Data Center Survey report from Uptime Institute gathered responses from nearly 900 data center operators and IT practitioners, both from major data center providers and from private, company-owned data centers.
Even though we provision all equipment with redundancy, there is a chance that these machines will not work as expected during an incident. One example: diesel rotary uninterruptible power supply (DRUPS) systems were implicated in power disruptions between 2014 and 2016 that affected Amazon Web Services in Sydney, a former Telecity facility called Sovereign House in London (now owned by Digital Realty Trust), and the Singapore Stock Exchange. The disruption at Amazon was caused by what the company called “an unusually long voltage sag.” A review of these incidents shows that the outages stemmed from a utility failure followed by backup equipment that failed to start.
Some of the failures commonly reported in data center incidents are listed below:
• Generator fails to start.
• Generator fails after X hours of running.
• Utility power partially fails (usually the loss of one of three phases)
• UPS fails to switch to battery
• UPS fails to switch from battery back to input power
These incidents show that periodic checks and preventive maintenance are essential to reducing the impact of failures.
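The opening definition ties availability to a system’s reliability and its recovery time. A minimal sketch of that relationship follows, assuming the commonly used steady-state formula availability = MTBF / (MTBF + MTTR) and assuming that components fail independently of one another. The function names and every figure in the example (MTBF, MTTR, generator start probability, UPS availability) are hypothetical illustrations, not values from the surveys cited here.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures
    (reliability) and mean time to repair (recovery time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series(*parts: float) -> float:
    """Every part must work, e.g. power source -> UPS -> rack."""
    result = 1.0
    for a in parts:
        result *= a
    return result


def parallel(*parts: float) -> float:
    """Redundant parts: the group is down only if all parts are down.
    Assumes independent failures, which a shared regional disaster breaks."""
    all_down = 1.0
    for a in parts:
        all_down *= (1.0 - a)
    return 1.0 - all_down


def downtime_hours_per_year(a: float) -> float:
    """Expected yearly downtime implied by an availability figure."""
    return (1.0 - a) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    utility = availability(mtbf_hours=2000, mttr_hours=4)
    generator = 0.95   # assumed chance the generator starts and carries the load
    ups = 0.9999       # assumed UPS availability

    power_source = parallel(utility, generator)
    power_chain = series(power_source, ups)

    print(f"utility alone:     {downtime_hours_per_year(utility):.2f} h/year down")
    print(f"with generator:    {downtime_hours_per_year(power_source):.2f} h/year down")
    print(f"full power chain:  {downtime_hours_per_year(power_chain):.2f} h/year down")
```

Lowering the assumed generator figure shows why a generator that fails to start matters so much: the redundancy only helps to the extent that the backup actually works on demand, which is what periodic tests are meant to confirm.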
  • 5. Equipment
Data center infrastructure is a large collection of equipment, and its success depends on all of that equipment working together efficiently. Any electrical, mechanical, cooling, networking, or server equipment can fail at an unexpected time. Whether it’s a server reaching the end of its five-year expected lifespan or a UPS backup battery dying before it should, equipment failure is one of the most common causes of data center outages.
With today’s powerful data center infrastructure management (DCIM) tools, facilities can monitor the overall health of their own equipment as well as colocated assets. While it may not be possible to predict every failure, sophisticated algorithms can monitor equipment performance continually to anticipate when hardware is reaching the end of its lifecycle or is prone to break down. When these problems are identified, data center personnel can plan to switch out faulty or outdated equipment without having to take critical systems offline. With the right redundancies, backups, and emergency spares in place, even an unexpected failure can be managed without compromising network performance.
Source: www.vxchnge.com & www.pingdom.com
  • 6. Have a comment or points to review? Knowledge is power, and it grows by sharing. Feel free to comment.