Why predictive modeling is
essential for managing a
modern computing facility
Jonathan G. Koomey, Ph.D.
http://www.koomey
Research Fellow, Steyer-Taylor Center for Energy
Policy and Finance, Stanford University
Data Center Dynamics
San Francisco, CA
July 12, 2013
1
Understanding systems
2
The business problem
•  Data centers deliver computing services
that generate business value (i.e., profits)
•  Decisions about IT deployment over the
facility life almost never take business
value fully into account, because of
– siloed departments and budgets
– misplaced incentives
– imperfect foresight
3
The data center problem
•  Facilities are built using an estimate of
compute capacity that is never realized
•  IT deployment decisions after construction
almost never follow the plan
•  The result: capacity lost to fragmentation,
which strands capex and raises the cost
per computation
4
Capacity fragments over time
5
The actual IT configuration will differ from the design assumptions. These differences will fragment
space, power, cooling & networking resources, and ultimately, limit data center capacity.
Source: Future Facilities
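One way to make the fragmentation idea concrete is the short Python sketch below. It is illustrative only: the resource names and all numbers are hypothetical, and it assumes future deployments consume resources in the same proportions as today. The resource closest to exhaustion caps further deployment, and whatever remains of the other resources at that point is stranded.

```python
# Hypothetical design capacities and current usage for one data hall.
# None of these numbers come from the talk.
design = {"space_racks": 200, "power_kw": 1000.0, "cooling_kw": 900.0, "network_ports": 4800}
used = {"space_racks": 120, "power_kw": 500.0, "cooling_kw": 810.0, "network_ports": 2400}


def utilization(design, used):
    """Fraction of each resource's design capacity already consumed."""
    return {name: used[name] / design[name] for name in design}


def binding_constraint(design, used):
    """Resource closest to exhaustion; it caps further IT deployment."""
    util = utilization(design, used)
    return max(util, key=util.get)


def stranded_fraction(design, used):
    """Share of each resource left unused when the binding resource hits 100%.

    Assumes usage of every resource keeps growing in today's proportions.
    """
    util = utilization(design, used)
    limit = max(util.values())
    return {name: 1.0 - u / limit for name, u in util.items()}


binding = binding_constraint(design, used)
stranded = stranded_fraction(design, used)
for name, frac in stranded.items():
    print(f"{name}: {frac:.0%} stranded when {binding} runs out")
```

In this made-up case cooling runs out first, leaving a large share of the power and space that was paid for unusable.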
My focus today
•  What is a model?
– Uses of models
– Making a model
•  Why predictive modeling is essential for
avoiding stranded capex in data centers
•  Case study: Predictive modeling for
Equinix
6
“An explicit model is a laboratory
for the imagination.”
–Anthony Starfield et al., How to Model It.
7
The Bay Model, Sausalito, CA
http://www.spn.usace.army.mil/Missions/Recreation/BayModelVisitorCenter.aspx
8
Everyone uses models, most badly
•  Usually informal models
•  Intuitive but not necessarily accurate
– Ignoring physics and interdependencies
– Ignoring effects of actions on lost capacity and
business value
•  Need to be more formal!
9
Uses of formal models
•  Organize
– thinking
– data
– assumptions
– terminology
– communication between teams
•  Learn about complex systems
– Intuition usually isn’t enough!
•  Test alternative choices to aid planning
10
Making a model
•  Understand first principles
– Key drivers
– Functional relationships
•  Formalize using equations or physical
structures
•  Test against reality
– measure and calibrate
•  Then (and only then) use model to test
alternatives!
11
Accurate calibration requires…
•  Real-time measurements
•  Comparison of model results to
measurement
•  Understanding of physical reasons for
differences
•  Adjustment of model parameters,
accounting for physical reality (can’t just
hard-wire results!)
12
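The compare-and-adjust loop can be sketched in a few lines of Python. This is a minimal illustration, not the calibration procedure used in any particular tool: the one-parameter model (air temperature rise proportional to cabinet power), the design value of the parameter, and all measurements are hypothetical. The fitted value still needs a physical explanation before it is trusted.

```python
def fit_slope(power_kw, temp_rise_c):
    """Least-squares slope k for the model temp_rise = k * power (no intercept)."""
    num = sum(p * t for p, t in zip(power_kw, temp_rise_c))
    den = sum(p * p for p in power_kw)
    return num / den


def rms_error(k, power_kw, temp_rise_c):
    """Root-mean-square gap between model prediction and measurement."""
    sq = [(k * p - t) ** 2 for p, t in zip(power_kw, temp_rise_c)]
    return (sum(sq) / len(sq)) ** 0.5


# Hypothetical measurements: cabinet power (kW) and measured air temperature rise (deg C).
power_kw = [2.0, 4.0, 6.0, 8.0]
temp_rise_c = [3.1, 5.9, 9.2, 11.8]

k_design = 1.2  # assumed uncalibrated design value (deg C per kW)
k_fit = fit_slope(power_kw, temp_rise_c)

print(f"design k = {k_design}, error = {rms_error(k_design, power_kw, temp_rise_c):.2f} deg C")
print(f"fitted k = {k_fit:.3f}, error = {rms_error(k_fit, power_kw, temp_rise_c):.2f} deg C")
```

A large gap between the design and fitted values is a prompt to look for the physical cause (for example, a blocked vent), not just a number to overwrite.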
Real measurements needed!
13
Data centers are complex systems
[Figure: two images separated by ≠. Image sources: http://www.fatcow.com/data-center-photos, http://www.dell.com]
14
Same equipment, different locations
15
Source: Future Facilities
Key data center issues
•  Constraints
– Reliability
– Power
– Cooling
– Space
– Networking
•  Interdependencies between
– Constraints
– Business objectives
16
A complete model of a data center
should include…
•  Characteristics of equipment
– Physical dimensions and location
– Operating characteristics (e.g., utilization)
– Power use/efficiency curves
– Equipment and building level air flows
•  Characteristics of the physical space
– #, type, capacity, and location of vents/fans
– Obstructions (e.g., stray boxes and cabling)
– Modifications in the envelope
17
An accurate model also requires
•  Real-time measurement (i.e., DCIM) of
– Temperature
– Air flows
– Power use
•  Periodic calibration to reflect changed
conditions over time
•  Performance and financial metrics to judge
progress
18
and all of these things need to
be tracked in real time for the
life of the facility!
19
Equinix case study
20
Characteristics of Equinix facility
•  Case study, Spring 2013
•  Colocation facility in the SF Bay Area
•  Floor 1, modeled white space: 8,750 sq ft
•  Total facility floor space: 42,000 sq ft
•  Details on infrastructure
– 2 ft raised floor for airflow delivery
– 42” false ceiling return plenum
– 12 air handling units (AHUs) with N+2 redundancy
21
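The figures above imply two simple ratios, worked out in the Python snippet below. It assumes the usual reading of N+2 (N units carry the design load and two more are spare) and that all 12 AHUs belong to one redundancy group; the slide does not confirm either point.

```python
modeled_white_space_sqft = 8750   # from the slide
total_floor_space_sqft = 42000    # from the slide
ahu_installed = 12                # from the slide
ahu_redundant = 2                 # the "+2" in N+2 (assumed interpretation)

# Share of the facility's floor space covered by the model.
modeled_share = modeled_white_space_sqft / total_floor_space_sqft

# Units that must run to carry the design load if 12 = N + 2.
ahu_needed_for_load = ahu_installed - ahu_redundant

print(f"Modeled share of floor space: {modeled_share:.1%}")   # 20.8%
print(f"AHUs needed to carry the design load: {ahu_needed_for_load}")  # 10
```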
Recapturing lost capacity
22
Source: Future Facilities
Predictive IT deployment
23
•  How can Equinix
identify void
capacity for
clients?
•  Void capacity can
be reclaimed!
•  Simulating IT
changes prior to
installation will:
–  Increase thermal
resilience
–  Enable additional
cabinet power to
be utilized
[Figure: Managing IT Deployment, projected configuration from current]
Source: Future Facilities
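The idea of testing a change before installing it can be sketched as below. This is a deliberately simplified Python illustration: the `Cabinet` fields, the fixed per-cabinet limits, and all numbers are hypothetical. A real predictive model would derive the cooling limit at each position from the simulated airflow rather than from a fixed number.

```python
from dataclasses import dataclass


@dataclass
class Cabinet:
    name: str
    power_limit_kw: float    # assumed electrical limit for the cabinet
    cooling_limit_kw: float  # assumed heat load the airflow at this position can remove
    load_kw: float           # current IT load


def check_deployment(cabinets, additions_kw):
    """Return cabinets whose projected load would exceed a power or cooling limit."""
    problems = {}
    for cab in cabinets:
        projected = cab.load_kw + additions_kw.get(cab.name, 0.0)
        reasons = []
        if projected > cab.power_limit_kw:
            reasons.append("power")
        if projected > cab.cooling_limit_kw:
            reasons.append("cooling")
        if reasons:
            problems[cab.name] = reasons
    return problems


cabinets = [
    Cabinet("A1", power_limit_kw=6.0, cooling_limit_kw=5.0, load_kw=3.0),
    Cabinet("A2", power_limit_kw=6.0, cooling_limit_kw=6.0, load_kw=2.0),
]
proposed = {"A1": 2.5, "A2": 2.5}  # kW of new IT load per cabinet (hypothetical)

problems = check_deployment(cabinets, proposed)
print(problems)
```

Here the same 2.5 kW addition fits in one cabinet but not the other, which is the kind of result that is cheaper to learn from a model than from the live floor.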
Recapture lost capacity
24
Conclusions
•  Data centers are complex systems, changing
constantly over time
–  Like a game of Tetris
–  Fragmentation leads to lost capacity
•  Monitoring and measurement are not
enough!
•  Much lost capacity can be reclaimed using
predictive modeling and state-of-the-art tools,
supported by DCIM measurements
•  Don’t turn knobs without knowing the likely
results!
25
References
•  Koomey, Jonathan, Kenneth G. Brill, W. Pitt Turner, John R. Stanley, and Bruce Taylor.
2007. A simple model for determining true total cost of ownership for data centers. Santa
Fe, NM: The Uptime Institute. September. <http://www.uptimeinstitute.org/>
•  Koomey, Jonathan. 2008. "Worldwide electricity used in data centers." Environmental
Research Letters. vol. 3, no. 034008. September 23.
<http://stacks.iop.org/1748-9326/3/034008>
•  Koomey, Jonathan. 2008. Turning Numbers into Knowledge: Mastering the Art of Problem
Solving. 2nd ed. Oakland, CA: Analytics Press. <http://www.analyticspress.com>
•  Koomey, Jonathan. 2011. Growth in data center electricity use 2005 to 2010. Oakland, CA:
Analytics Press. August 1. <http://www.analyticspress.com/datacenters.html>
•  Stanley, John, and Jonathan Koomey. 2009. The Science of Measurement: Improving Data
Center Performance with Continuous Monitoring and Measurement of Site Infrastructure.
Oakland, CA: Analytics Press. October 23.
<http://www.analyticspress.com/scienceofmeasurement.html>
•  Starfield, Anthony M., Karl A. Smith, and Andrew L. Bleloch. 1990. How to Model It:
Problem Solving for the Computer Age. New York, NY: McGraw-Hill, Inc.
26
