Risk Assessments and Reliability, What You Need To Know

By the end of this presentation the attendees will understand the need for an Infrastructure Reliability and Risk Assessment for their critical environment as well as what types of systems should be included in the evaluation, how the evaluation should be performed to ensure tangible results, how it should be reported and ultimately how to interpret and utilize the information presented in the assessment to their advantage.

Presentation Outline
1. What is an Infrastructure Reliability and Risk Assessment and what do I need one for?
2. Who should perform an Infrastructure Reliability and Risk Assessment?
3. What information should be included in an Infrastructure Reliability and Risk Assessment?
4. What building systems should be included? This will be a system-by-system review of the infrastructure.
5. What are the key things to look for when my study is complete?
   A. Reliability Level.
   B. Single Points of Failure within Critical Systems.
   C. Redundancy of Critical Systems.
   D. System Integration.
   E. Adequacy of Engineered Systems (Exhaust Points).
   F. Adequacy of Operations, Maintenance and Testing Programs.
   G. Benchmark Findings with Industry Standards.
6. Availability, MTBF Calculations and Probability of Failure Calculations. What are they, who does them, what do they mean?
7. Computational fluid dynamic modeling.
8. How long should a study like this take?
9. Review of a sample study.


  1. INFRASTRUCTURE RELIABILITY AND RISK ASSESSMENTS
     Steven Shapiro, P.E., ATD
     Mission Critical Practice Lead
     Morrison Hershfield Mission Critical
  2. WHAT YOU NEED TO KNOW: AGENDA
     • RISK ASSESSMENT
     • INFRASTRUCTURE RELIABILITY (COOLING, POWER)
     Morrison Hershfield Mission Critical – Infrastructure and Risk Assessments
  3. RISK ASSESSMENTS
     • WHY
     • SITE EVALUATION
     • METRICS
  4. Causes of Critical Failures
     • Location
     • Design
     • Redundancy level
     • Construction
     • Quality of equipment
     • Age
     Lurking Vulnerabilities
     • Operations & Maintenance program
     • Personnel training
     • Level of operator coverage
     • Thoroughness of the commissioning program
  5. Causes of Critical Failures
     • Equipment failure
     • Operator error
     • Natural disaster
     • Design error
     • Installation error
     • Commissioning or test deficiency
     • Maintenance oversight
     • Equipment design
  6. Causes of Critical Failures
     • Root cause is not always easy to ascertain
     • Combination of factors (cascading failures)
     • Latent failures
     • Most occur during change-of-state events
     • More maintenance does not necessarily mean higher availability
     • Non-fault-tolerant systems
  7. Causes of Critical Failures (chart of percentage share by cause)
     • Equipment Failure: 28%
     • System Design: 20%
     • Human Error: 18%
     • Equipment Design: 13%
     • Installation Error: 10%
     • Commissioning or Test Deficiency: 4%
     • Maintenance Oversight: 4%
     • Natural Disaster: 3%
  8. WHY DO A RISK ASSESSMENT
     • Aligns the business mission with the facility performance expectation
     • Quantifies the risk and exposure of the critical facilities to failure
     • Identifies vulnerabilities and single points of failure
     • First step in creating an action plan for site hardening
     • Benchmarks the facility against the industry
     • Assists in developing the business case for capital expenditures
  9. SITE EVALUATION
     STEP 1
     • Quantify reliability expectations
     • Develop resiliency metrics
  10. SITE EVALUATION
      STEP 2
      • Develop PRA model (Probabilistic Risk Assessment)
      • Identify Single Points of Failure within critical systems
      • Evaluate redundancy of critical systems
      • Capacity and expandability analysis
      • Adequacy of Engineered Systems
      • Operation and maintenance policies, practices and procedures
      • Adequacy of maintenance and testing programs
      • Evaluate risks associated with site location
      • Overall Risk Analysis
      • Evaluate the adequacy of operations and maintenance programs
  11. SITE EVALUATION
      STEP 2 cont.
      • Harmonics analysis
      • EMF studies
      • Short circuit & coordination studies
      • Air flow modeling (CFD)
  12. SITE EVALUATION
      STEP 3
      • Perform gap analysis
      STEP 4
      • Recommendations for upgrade/alteration to optimize facility performance
      • Budget and schedule development
      • Assess risk during implementation
      • Benchmark findings with industry standards
  13. RISK ASSESSMENT METRICS
      • Probability of Failure/Reliability
      • Availability
      • MTTF
      • MTTR
      • Susceptibility to natural disasters
      • Fault tolerance
      • Single Points of Failure
      • Maintainability
      • Operational readiness
      • Maintenance program
  14. INFRASTRUCTURE RELIABILITY
      • RELIABILITY / AVAILABILITY
      • RELIABILITY MODELING
      • RELIABILITY CONSIDERATIONS
  15. RELIABILITY
      • “Reliability” is used as an umbrella definition
      • May refer to Availability, Durability, Quality
      • Five 9’s ????
      • Reliability = Probability of Successful Operation
  16. RELIABILITY AND AVAILABILITY
      • Reliability predicts how likely the system is to fail.
      • Availability is a measure (or a future prediction) of what percentage of the time the system will be operating properly.
  17. AVAILABILITY
      Five 9’s refers to Availability.
      Availability (A) = average fraction of time something is in service and performing its intended function.
      99.999% availability means:
      • 5.3 minutes of downtime each year, or
      • 1.77 hours of downtime every 20 years
      Availability does not specify how often an outage occurs.
  18. AVAILABILITY
      Availability (A) = MTTF/(MTTF + MTTR) = MTTF/MTBF
      MTTF: Mean Time To Failure
      MTBF: Mean Time Between Failures
      MTTR: Mean Time to Repair or Downtime
      MTBF = MTTF + MTTR
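
As a rough illustration of the availability arithmetic on the two slides above, the following Python sketch converts an availability figure into expected downtime and computes availability from MTTF and MTTR. It reads availability as uptime divided by uptime plus downtime, in line with the slide's MTBF = MTTF + MTTR. The 8,760-hour year, the function names and the sample component values are assumptions made for illustration and are not taken from the presentation.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year, leap days ignored


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime / (uptime + downtime)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_minutes_per_year(a: float) -> float:
    """Expected annual downtime for availability `a` (a fraction, not a percent)."""
    return (1.0 - a) * HOURS_PER_YEAR * 60


five_nines = 0.99999
per_year = downtime_minutes_per_year(five_nines)
print(f"{per_year:.2f} minutes of downtime per year")
print(f"{per_year * 20 / 60:.2f} hours of downtime per 20 years")

# Hypothetical component: runs 50,000 h between failures, takes 4 h to repair.
print(f"availability = {availability(50_000, 4):.6f}")
```

Without intermediate rounding the five-nines figures come out near 5.26 minutes per year and 1.75 hours per 20 years; the slide's 5.3 minutes and 1.77 hours follow from rounding the annual figure first.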
  19. RELIABILITY BATHTUB CURVE
      [Figure: failure rate versus time (t) in years, with early life, useful life period and wear-out regions; time axis marked at 0.5, 12 and 14 years]
  20. RELIABILITY MODELING
      • Used to compare system designs and assist in the evaluation of risk versus the cost to mitigate the risk.
      • Failure and repair data come from IEEE 493, Recommended Practice for Design of Reliable Industrial and Commercial Power Systems (IEEE Gold Book)
  21. RELIABILITY MODELING
      Components used for reliability modeling of the electrical system shown here:
      • Utility power
      • Generator
      • Circuit breakers
      • Switchboards
      • Cables
      • Automatic Transfer Switch
      • UPS module
      • Battery
      • Static Bypass Switch
      • Rack Power
  22. RELIABILITY MODELING
      Reliability Block Diagram (RBD)
  23. RELIABILITY MODELING
      Shown below are the results of the calculations.
      [Charts: results plotted in hours]
  24. THE TRADITIONAL CLASSIFICATION SYSTEM (The Uptime Institute)
      • Tier 1 – Basic Non-Redundant Data Center: single path for power and cooling distribution without redundant components
      • Tier 2 – Basic Redundant Data Center: single path for power and cooling distribution with redundant components
      • Tier 3 – Concurrently Maintainable Data Center: multiple paths for power and cooling distribution with only one path active and with redundant components
      • Tier 4 – Fault Tolerant Data Center: multiple active power and cooling distribution paths with redundant components and fault tolerance
  25. Tier Definitions

      TIER REQUIREMENTS           Tier I   Tier II   Tier III               Tier IV
      Number of Delivery Paths    1        1         1 Active, 1 Passive    2 Active
      Redundancy                  N        N+1       N+1                    2N Minimum
      Compartmentalization        No       No        No                     Yes
      Concurrent Maintainability  No       No        Yes                    Yes
      Fault Tolerance             No       No        No                     Yes
      Availability (%)            99.67    99.75     99.982                 99.995
      Downtime in Hr/Yr           28.8     22        1.6                    0.4
  26. Data Center Cost (from the UI)
      • Tier I - $10,000 US/kW of Useable UPS Power Output
      • Tier II - $11,000 US/kW of Useable UPS Power Output
      • Tier III - $20,000 US/kW of Useable UPS Power Output
      • Tier IV - $22,000 US/kW of Useable UPS Power Output
      • Plus $225 US/SF of Computer Room
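
The cost figures above combine into a simple estimate: the per-kW rate for the tier times the useable UPS output, plus $225 per square foot of computer room. The Python sketch below shows that arithmetic. The function name and the sample facility (2,000 kW, 10,000 SF) are hypothetical and chosen only to illustrate the calculation; only the unit costs come from the slide.

```python
# Unit costs quoted on the slide, in US$ per kW of useable UPS power output.
COST_PER_KW = {"I": 10_000, "II": 11_000, "III": 20_000, "IV": 22_000}
COST_PER_SF = 225  # US$ per square foot of computer room, from the slide


def estimated_cost(tier: str, ups_kw: float, computer_room_sf: float) -> float:
    """Rough cost: tier rate x UPS kW, plus the per-square-foot adder."""
    return COST_PER_KW[tier] * ups_kw + COST_PER_SF * computer_room_sf


# Hypothetical facility: 2,000 kW of UPS output and 10,000 SF of computer room.
for tier in COST_PER_KW:
    print(f"Tier {tier}: ${estimated_cost(tier, 2_000, 10_000):,.0f}")
```

For this hypothetical facility the step from Tier II to Tier III nearly doubles the estimate, which is the kind of jump a business case for a higher tier would have to justify.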
  27. HOW MUCH REDUNDANCY IS ENOUGH?
  28. Reliability Considerations
      Assumptions
      • Various configurations examined for single or dual utility feeders, UPS, generators, STSs, single or dual cords
      • Compare reliability at 2000 kW and 4000 kW load
      • 5-year probability of failure
  29. Single utility feeder, parallel redundant UPS and generators, single cord IT equipment
  30. 2N UPS, N+1 Generators, ASTSs, Dual Cord Rack
  31. Two Utility Feeders, 2(N+1) UPS, 2(N+1) Generators, ASTSs, Dual Cord Rack
  32. Distributed Redundant UPS, N+2 Generators, Two Utility Feeders, ASTSs and Dual Cord Rack
  33. Reliability Considerations
  34. Reliability Considerations
  35. Reliability Considerations
  36. Reliability Considerations
  37. Reliability Considerations
  38. Reliability Considerations
      Emergency Diesel Generators
      [Chart categories: fail to start; fail after ½ hour; fail after 8 hours; fail after 24 hours]
      Study performed by Idaho National Engineering Laboratory at nuclear power plants, February 1996
  39. Reliability Considerations
      • 2(N+1) UPS/Generator with dual utility feeders is the most reliable topology
      • 2(N+1) UPS is more reliable than 2N UPS by a small margin
      • 2N is more reliable than Distributed Redundant by a small margin
      • Significant improvement if a second utility feeder is provided
      • N+2 and/or 2N generator systems are more reliable than N+1
      • A hybrid configuration in a hybrid facility is sometimes the best solution
  40. Reliability Considerations
      • Assess the condition of the mechanical plant in conjunction with the electrical system
      • The facility reliability will be driven by the least reliable component (typically the electrical infrastructure)
  41. System Reliability Block
      [Diagram blocks: "Electrical System" (electrical system powering the critical load); "Electrical" and "Mechanical" (mechanical system supporting critical load)]
  42. System Reliability Block

                                                 MTBF      Availability   Pf (3 years)
      Electrical system alone                    330,184   0.99999        8.10%
      Mechanical system alone                    178,611   0.999943       11.70%
      Electrical system supporting mechanical    108,500   0.999985       21.40%
      Overall mechanical system                  70,087    0.999931       29.20%
      Combined electrical mechanical system      57,819    0.999922       36.90%

      [Diagram blocks: "Electrical System" (electrical system powering the critical load); "Electrical" and "Mechanical" (mechanical system supporting critical load)]
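
One common way to combine blocks that must all work is a series model: failure rates (1/MTBF) add and availabilities multiply. The Python sketch below applies that textbook approach to the "Electrical system alone" and "Overall mechanical system" rows of the table. It is not presented as the method used in the study. Treating MTBF as hours, the 8,760-hour year, the constant-failure-rate (exponential) formula for probability of failure, and all function names are assumptions.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: MTBF values are in hours, 365-day year


def series_mtbf(*mtbfs: float) -> float:
    """MTBF of blocks in series: the individual failure rates add."""
    return 1.0 / sum(1.0 / m for m in mtbfs)


def series_availability(*avails: float) -> float:
    """Availability of blocks in series: every block must be up at once."""
    return math.prod(avails)


def probability_of_failure(mtbf_hours: float, years: float) -> float:
    """Chance of at least one failure in `years`, assuming a constant failure rate."""
    return 1.0 - math.exp(-years * HOURS_PER_YEAR / mtbf_hours)


# Rows taken from the table: electrical system alone, overall mechanical system.
electrical = {"mtbf": 330_184, "availability": 0.99999}
mechanical = {"mtbf": 70_087, "availability": 0.999931}

combined_mtbf = series_mtbf(electrical["mtbf"], mechanical["mtbf"])
combined_avail = series_availability(
    electrical["availability"], mechanical["availability"]
)
combined_pf = probability_of_failure(combined_mtbf, years=3)

print(f"combined MTBF         = {combined_mtbf:,.0f}")
print(f"combined availability = {combined_avail:.6f}")
print(f"combined Pf (3 years) = {combined_pf:.1%}")
```

Under these assumptions the combined MTBF and availability land close to the table's combined row. The three-year probability of failure is in the same range but is not expected to match the table exactly, since the study's model presumably accounts for redundancy and repair in ways a single exponential does not.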
  43. The Cost of Reliability
      [Chart: reliability levels 99.9999, 99.999, 99.99, 99.9, 99.0 and .9 plotted against cost, from $ to $$$$$]
  44. Key Takeaways – Risk Assessment
      • What reliability level do you really need, based on your business case?
      • Minimize Single Points of Failure
      • Concurrent Maintainability?
      • Fault Tolerance?
      • Ensure adequacy of Operations, Maintenance and Testing Programs
      • How do you justify the cost to upgrade from the present state?
  45. Key Takeaways – Reliability
      • Design objective: find the optimum compromise between cost and reliability
      • Size matters: larger facilities yield lower reliability
      • System architecture and design implementation play a more important role than equipment selection
      • Segregate the system into independent blocks
      • Eliminate common source components to minimize fault propagation (e.g. LBS, hot-tie, manual bus ties)
      • Move single points of failure as close to the load as possible
      • Always maintain two independent sources of power to the critical load
      • Optimize the design of monitoring and control circuits
      • Keep it simple, minimize human intervention, utilize automation
  46. QUESTIONS?
      Thank you and please feel free to contact me.
      Steven Shapiro, PE, ATD
      SShapiro@MorrisonHershfield.com
      914.420.3213
      http://www.linkedin.com/in/stevenshapirope
      References: Uptime Institute White Papers:
      • Tier Myths and Misconceptions
      • Data Center Site Infrastructure Tier Standard: Topology
  47. Building Areas/Systems Reviewed
      • General Construction
      • Electrical
      • Mechanical
      • Plumbing And Fire Protection
      • Operation and Maintenance
      • Security
      • Load Density
  48. Site Reliability
      • Is Project Compatible With Zoning
      • Natural Environment Issues
        - Seismic Zone
        - Geo Technical Reports
        - Sub Surface Conditions
        - Tornado/Hurricane Risk
        - Site Flood Potential
        - Fire Potential
        - Site Topography
        - Weather Extremes
      • Man-Made Environment Issues
        - Power/Data and Communication/Water Supply/Sanitary Sewer Availability
        - ISP Connectivity to Mirror and DR Sites
        - Proximity of Hazardous Operational Facilities, e.g. Nuclear Power Plants, Military Bases, Chemical Plants, Tank Farms, Water/Sewage Treatment Plants, Dams/Reservoirs, Gas Stations, etc.
        - Distance to Airports & Freeways
        - Distance to Emergency Services, e.g. Fire and Police Departments, Hospital
  49. Building Areas/Systems Reviewed
      • Building Utilities and Physical Issues
        - General building systems and area characteristics
        - Life safety and environmental
      • Electrical Systems
        - Utility feeders
        - Service entry
        - Base building electrical distribution system including busways, step-down transformers, switchgear and distribution panels
        - Uninterruptible power supply (UPS) systems
        - Battery systems
        - Power Distribution System including the critical computer rooms
        - Emergency/standby generator and fuel system
        - Normal/standby power transfer switchgear
        - Grounding
        - Emergency Power Off Systems
        - Lightning protection system
        - Fire alarm and smoke detection systems
  50. Building Areas/Systems Reviewed
      • Mechanical Systems
        - Critical Systems Chilled Water Plant: chillers, pumps, piping distribution system, controls, etc.
        - Critical Systems Condenser Water System: cooling towers, pumps, piping, etc.
        - Critical Systems Air Handling Systems
        - Critical Systems Air Distribution
        - Critical Systems Secondary Chilled Water Loop
        - Fuel Oil Systems
        - Boiler Systems
        - Compressed Air Systems
      • Plumbing Systems
        - Domestic Water Systems
        - Natural Gas Systems
        - Fire Suppression Systems (Water and Gaseous)
      • Operation and Maintenance of the Critical Support Systems
        - Maintenance procedures and programs
        - Normal operating procedures
        - Emergency operating procedures
        - Training programs and methods
        - Spare parts
  51. Building Areas/Systems Reviewed
      • Building Automation
        - Building Automation Systems
        - Physical Security Systems
        - Access control
        - Intrusion detection
        - CCTV systems
        - ID badging systems
        - Intercom systems
        - Smoke Purge Systems
      • Technology Systems
        - Entrance Facility Feeds
        - Telephone Company Services
      • Systems Integration
        - The integration, compatibility and interaction of the above systems with each other, as well as with the other building elements, will be reviewed to ensure that the systems are compatible and fully integrated.
