The Black Swan Effect – when “what can never happen -- does.”
Presented at the Data Center World Conference, Las Vegas, April 2015
By: Rich Banta, co-owner, Lifeline Data Centers, Indianapolis
Don Byrne, PhD, President and CEO - Metrix411, Boston
Jack Pyne, Director of Training - EPI-AP, Colorado Springs
Common Data Center Risks
• Unlicensed software
• Home-grown code in critical path
• Single carriers/utility providers (no diversity)
• No policy/guidance for controlling BYOD
• Rogue wireless access points
• Local purchasing leading to a lack of configuration control
• Inaccurate change management tracking
• Out-of-date documentation
• Changing compliance requirements with rules/standards/laws
• Unnoticed facility flaws (e.g., internal wooden frames)
• ‘Sandbox’ projects using actual client data for testing
• No data governance software
What does this have to do with risk management?
Risk management faces different issues
• Avoiding, mitigating or accepting risk
• What is the risk?
• Assuring agencies, clients and stakeholders that you have managed the risk appropriately
• Confidence
• Communication
Putting Risk Management in Action
Reliability-Centered Maintenance
• Developed by the FAA and the airlines in the 1960s
• Adopted by the US Military in the 1970s
• Adopted by the nuclear power industry in the 1980s
• Disney uses it in their theme parks
Putting Risk Management in Action - RCM
• Business-case oriented
• Formalized in SAE JA1011
• Certification is available from Naval Air Systems Command (NAVAIR) and others
• Risk assessment and management on steroids – all the way down to equipment component levels
Putting Risk Management in Action - RCM
FMECA: Failure Mode, Effects, and Criticality Analysis
• Bottom-up
• Inductive analytical method performed at the functional or piece-part level
• Includes criticality analysis
• Charts the probability of failure modes against the severity of their consequences
Failure Potential is for a 12-month period. Criticality Factor runs from 1 to 5, where 1 is least critical and 5 is ultra critical.

Component                        Failure Potential   Criticality Factor   Priority   Comments
Ventilator Fan -- unit 30-b1     99%                 5                    49.5
Filter Gasket -- g-205           98%                 4                    39.2       Needs monthly replacement
UPS -- unit c25                  60%                 5                    30
Generator -- unit g-5            35%                 5                    17.5       4 years old
HVAC Drain pump -- unit p-304    45%                 3                    13.5
Generator -- unit g-4            20%                 5                    10         2 years old
Ventilator Fan -- unit 30-b2     30%                 2                    6
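
The ranking in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' tool. The slide does not state how Priority is calculated; the formula used here (failure potential in percent, times criticality factor, divided by 10) is an assumption inferred from the table, whose values it reproduces. The names FmecaRow, ROWS, and rank are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmecaRow:
    component: str
    failure_pct: int   # failure potential over a 12-month period, in percent
    criticality: int   # 1 (least critical) to 5 (ultra critical)
    comment: str = ""

    @property
    def priority(self) -> float:
        # Assumed formula, inferred from the slide's table rather than stated on it.
        return self.failure_pct * self.criticality / 10


# Values copied from the slide's table.
ROWS = [
    FmecaRow("Ventilator Fan -- unit 30-b1", 99, 5),
    FmecaRow("Filter Gasket -- g-205", 98, 4, "Needs monthly replacement"),
    FmecaRow("UPS -- unit c25", 60, 5),
    FmecaRow("Generator -- unit g-5", 35, 5, "4 years old"),
    FmecaRow("HVAC Drain pump -- unit p-304", 45, 3),
    FmecaRow("Generator -- unit g-4", 20, 5, "2 years old"),
    FmecaRow("Ventilator Fan -- unit 30-b2", 30, 2),
]


def rank(rows):
    """Return the rows ordered from highest to lowest priority."""
    return sorted(rows, key=lambda r: r.priority, reverse=True)


if __name__ == "__main__":
    for row in rank(ROWS):
        print(f"{row.component:<32}{row.failure_pct:>4}%  "
              f"{row.criticality}  {row.priority:>5.1f}  {row.comment}")
```

Under this assumed formula, a highly critical but fairly reliable component (Generator g-4, priority 10) ranks below a less critical one that is likely to fail (HVAC drain pump p-304, priority 13.5), which matches the ordering in the table.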
Putting Risk Management in Action - RCM
FMECA: Failure Mode, Effects, and Criticality Analysis
FMECAs are reviewed, refreshed, and maintained at
least on an annual basis, with the collected data
incorporated into an ongoing and dynamic failure
probability analysis model.
Putting Risk Management in Action - RCM
When evaluating and purchasing data center infrastructure equipment (generators, UPS systems, HVAC gear, etc.), demand copies of the FMECAs from the manufacturer.
Putting Risk Management in Action - RCM
• Increasingly interface directly with corporate/enterprise risk managers.
• They are becoming more and more conversant in RCM, failure probability analysis, and the associated value to the risk assessment and risk management equation.
Rich Banta – Co-owner, Lifeline Data Centers, Indianapolis
Rich is responsible for compliance and certifications, data
center operations, information technology, and client
concierge services. He has an extensive background in
server and network management, large scale wide-area
networks, storage, business continuity, and monitoring.
He was formerly the Chief Technology Officer of a major
health care system. Rich is hands-on every day in the data
centers.
Certifications
His certifications include:
CISA – Certified Information Systems Auditor
CRISC – Certified in Risk and Information Systems Control
CDCE – Certified Data Center Expert
CDCDP – Certified Data Center Design Professional
CTDC – Certified TIA-942 Design Consultant
CTIA – Certified TIA-942 Auditor
CFCP – Certified FISMA Compliance Practitioner
The Risky Data Center
Panelists
• Don Byrne: Introduction and Concepts
• Jack Pyne: Standards Overview
• Rich Banta: Applying the Concepts