Cooling & Power Issues
Data Center Forum: Cooling & Power Issues - City Beach, Fremont - October 12, 2006

Cooling & Power Issues: Presentation Transcript

  • Data Center Forum: Power & Cooling Issues – October 12, 2006. Presenters: Dr. Robert Sullivan (“Dr. Bob”), Triton Technology Systems; Fritz Menchinger, NER
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • MOORE’S LAW
    • “The number of transistors on a chip doubles every 24 months and the performance doubles every 18 months.”
    • Intel cofounder Gordon Moore (1965)
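    The arithmetic below is my illustrative sketch, not part of the deck: it simply converts the two doubling periods in the quote into growth multipliers over a three-year planning horizon.

```python
# Illustrative only: growth multipliers implied by the doubling periods quoted
# above (transistors every 24 months, performance every 18 months).

def doubling_multiplier(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years` for a quantity that doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)

horizon_years = 3
print(f"Transistors per die: ~{doubling_multiplier(horizon_years, 2.0):.1f}x")  # ~2.8x
print(f"Performance:         ~{doubling_multiplier(horizon_years, 1.5):.1f}x")  # ~4.0x
```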
  • GROWTH IN TRANSISTORS PER DIE © 2003 Intel Corporation
  • 2005 – 2010 PROJECTIONS PRODUCT HEAT DENSITY TREND CHART
  • INTEL PROJECTED POWER CONSUMPTION FOR 42 1-RU SERVERS (actual product power consumption has lagged these projections by about 2 years; *based on product footprint)
    • Q3 2001: 270 W/RU, 11.3 kW, 1,890 W/ft²*
    • Q3 2002: 354 W/RU, 14.8 kW, 2,478 W/ft²*
    • Q3 2003: 420 W/RU, 17.6 kW, 2,900 W/ft²*
    • Q3 2004: 480 W/RU, 20.2 kW, 3,360 W/ft²*
  • 2004 HIGH-END PRODUCTS (maximum configurations & options; 2004 trend chart mid-point projection was 1,800 W/ft²; *based on product footprint)
    • EMC DMX3: 32”x324”, 54.0 kW, 750 W/ft²*
    • IBM DASD: 32”x87”, 21.0 kW, 1,100 W/ft²*
    • HP Superdome: 28”x40”, 10.0 kW, 1,300 W/ft²*
    • IBM Z-Series: 36”x36”, 16.0 kW, 1,800 W/ft²*
    • Sun F15K: 36”x56”, 24.0 kW, 1,700 W/ft²*
  • 2004 BLADE AND 1U SERVERS (maximum configurations & options; trend chart mid-point projection for 2004 is 3,000 W/ft²; *W/ft² based on actual cabinet size)
    • HP ProLiant Ble: 18.0 kW, 3,000 W/ft²*
    • Electric oven (for comparison): 8.0 kW, 2,000 W/ft²*
    • RLX ServerBlade 3000i: 13.3 kW, 2,200 W/ft²*
    • Sun Sunfire: 14.0 kW, 2,300 W/ft²*
    • IBM eServer Blade Center: 18.0 kW, 3,000 W/ft²*
    • Dell PowerEdge 1850MC: 24.0 kW, 4,000 W/ft²*
  • IMPLICATIONS OF THE CMOS POWER CRISIS
    • Cost per processor is decreasing at 29% per year
    • Constant dollars spent on high performance IT hardware three years from now will buy:
      • 2.7 times more processors
      • 12 times more processing power in the same or less floor space
      • 3.3 times the UPS power consumption
    • Site power consumption will increase by at least 2x the UPS power consumption increase
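    A quick check of the first figure on this slide (my arithmetic, not the presenters'): a 29% annual decline in cost per processor compounds to roughly the quoted 2.7 times more processors per constant dollar over three years.

```python
# Illustrative check of the slide's figure: with cost per processor falling 29%
# per year, a constant budget buys 1 / (1 - 0.29)**3 times as many processors
# after three years.

annual_cost_decline = 0.29
years = 3
processor_multiplier = 1 / (1 - annual_cost_decline) ** years
print(f"Processors per constant dollar after {years} years: ~{processor_multiplier:.1f}x")
# ~2.8x, consistent with the "2.7 times more processors" bullet above.
```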
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • TEMPERATURE RISE UPON LOSS OF AIR COOLING
    • Time to critical temperature rise
      • 40 W/ft² - 10 minutes
      • 100 W/ft² - 3 to 5 minutes
      • 200 W/ft² - 1 to 3 minutes
      • 300 W/ft² - Less than a minute
      • 300 W/ft²?
        • 4.5 kW in 15 ft²
        • Single cabinet in typical Cold Aisle / Hot Aisle arrangement
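    The density figure above is simple arithmetic, sketched below for reference; the time-to-critical values are copied from this slide, the check itself is mine.

```python
# 4.5 kW dissipated over a 15 ft^2 footprint (one cabinet plus its share of the
# cold/hot aisle) works out to the 300 W/ft^2 cited above.
density_w_per_ft2 = 4_500 / 15
print(f"{density_w_per_ft2:.0f} W/ft^2")  # 300

# Time to critical temperature after loss of air cooling (values from the slide).
time_to_critical = {
    40: "about 10 minutes",
    100: "3 to 5 minutes",
    200: "1 to 3 minutes",
    300: "less than a minute",
}
for w_per_ft2, window in time_to_critical.items():
    print(f"{w_per_ft2:>4} W/ft^2 -> {window}")
```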
  • HIGH DENSITIES REQUIRE A COOLING PARADIGM SHIFT
    • A paradigm shift occurs when a previously loosely coupled site infrastructure system becomes tightly coupled
    • Small changes have a big impact (often with unexpected results)
    • Reliability of individual components, and fault tolerance if they malfunction, becomes critical
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • MISMATCHED EXPECTATIONS
    • “Does it NEVER FAIL?”
    • Versus
    • “Does it WORK?”
    • (Always available versus normally available)
    Linguistic Choices Are Critical
  • MISMATCHED EXPECTATIONS
    • When expectations do not match reality
      • IT demand = “24 by Forever” availability
      • Infrastructure = Tier I or Tier II facility
    Failure to Define Expectations
  • FUNCTIONALITY DEFINITIONS
    • Tier I: Single path, no redundancy
    • Tier II: Single path, redundant components
    • Tier III: Single active path, redundant
    • Tier IV: Multiple active paths, redundant
  • SINGLE POWER PATH
    • SINGLE POINTS-OF-FAILURE
    • UPS system level failure
    • Major circuit breakers (2-20)
    • Minor circuit breakers (20-500)
    • Plugs and receptacles (21-505)
    • Electrical connections (258-6180)
    • Human error
    • False EPO
    [Diagram: utility, battery, and generator (three power paths) converging into a single power path feeding the computer hardware]
  • DUAL POWER PATH
    • SINGLE POINTS-OF-FAILURE
            • UPS system level failure
            • Major circuit breakers (2-20)
            • Minor circuit breakers (20-500)
    [Diagram: utility, battery, and generator sources (three power paths) converging into two power paths feeding the computer hardware]
  • MISMATCHED EXPECTATIONS
    • “Match the required level of site infrastructure capacity, functionality, master planning, organizational charter and doctrine, staffing, processes, and training to availability expectations.”
    The Only Way to Assure Success
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • Cooling concerns (> marks the current topic)
    • > No computer room master plan
    • Failure to measure, monitor, and use installation best practices
    • Mechanical incapacity
    • Bypass airflow
    • Need for supplemental cooling
  • POOR MASTER PLANNING
    • Circle the wagons approach is common
    • Not good
    Cooling Unit Layout
  • POOR MASTER PLANNING
    • Random placement
    • Hot spot solution to “circle the wagons”
    • Worse
    Cooling Unit Layout
  • RAISED-FLOOR UTILIZATION
    • All aisles have elevated “mixed” temperature (starved supply airflow compounds problem)
    • Fails to deliver predictable air intake temperatures
    • Reduces return air temperature which reduces cooling unit capacity and removes moisture
    Traditional (Legacy) Layout
  • MASTER PLANNING
    • Static regain improves usable cooling unit redundancy
    • Maximizes static pressure & CFM per perforated tile
    • Minimizes the effect of high discharge velocity
    Achieves High IT Yield by Maximizing Cooling Delivery
  • COMPUTER ROOM LAYOUT
  • IT YIELD
    • Cooling delivery is typically the constraining factor
    • Manage cooling by zones of the overall room
      • One to four building bays max
    • Monitor and manage IT Yield performance metrics
      • Racks / thermal conduction ft²
      • Rack unit positions
      • PDU power
      • Breaker positions
      • Redundant sensible cooling capacity
      • Floor loading
    Maximize Site Investment Utilization
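    One way to picture the per-zone worksheet these metrics imply is sketched below; the field names and sample values are hypothetical, not from the presentation.

```python
# Hypothetical per-zone "IT Yield" record mirroring the metrics listed above;
# names and numbers are illustrative only.
from dataclasses import dataclass

@dataclass
class ZoneYield:
    zone: str
    racks: int
    floor_area_ft2: float                 # conditioned floor area served by the zone
    rack_unit_positions_free: int
    pdu_power_available_kw: float
    breaker_positions_free: int
    redundant_sensible_cooling_kw: float
    floor_loading_margin_lb_per_ft2: float

    def watts_per_ft2_at_full_power(self) -> float:
        """Heat density if all available PDU power were drawn as IT load."""
        return self.pdu_power_available_kw * 1000 / self.floor_area_ft2

bay_1 = ZoneYield("Bay 1", racks=24, floor_area_ft2=1200.0,
                  rack_unit_positions_free=310, pdu_power_available_kw=180.0,
                  breaker_positions_free=36, redundant_sensible_cooling_kw=90.0,
                  floor_loading_margin_lb_per_ft2=150.0)
print(f"{bay_1.zone}: {bay_1.watts_per_ft2_at_full_power():.0f} W/ft^2 at full PDU draw")
```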
  • Cooling concerns
    • No computer room master plan
    • > Failure to measure, monitor, and use installation best practices
    • Mechanical incapacity
    • Bypass airflow
    • Need for supplemental cooling
  • FAILURE TO MEASURE AND MONITOR
    • If you do not measure and record, you cannot monitor
    • If you do not monitor, you cannot control
    • Without controls, chaos reigns
  • MEASURING, MONITORING, AND CONTROL
    • Cooling delivery is typically the constraining factor
    • Manage cooling by zones of the overall room
      • One to four building bays max
    • Monitor and manage IT Yield performance metrics
      • Racks / thermal conduction ft²
      • Rack unit positions
      • PDU power
      • Breaker positions
      • Communication ports
      • Redundant sensible cooling capacity
      • Floor loading
  • MASTER PLAN MONITORING WORKSHEET
  • Cooling concerns
    • No computer room master plan
    • Failure to measure, monitor, and use installation best practices
    • > Mechanical incapacity
    • Bypass airflow
    • Need for supplemental cooling
  • ORIGINS OF THERMAL INCAPACITY Design or Equipment Related
    • A gross “ton” is not a “sensible” ton
    • DX system refrigerant only partially charged
    • “Dueling” dehumidification/humidification
    • Insufficient airflow across cooling coils
    • Chilled water temperature too low
    • Computer room return temperature too low
    • Too much cold air bypass through unmanaged openings (cable cutouts and penetrations to adjacent spaces)
  • ORIGINS OF THERMAL INCAPACITY Human Factors
    • Lack of psychrometric chart knowledge
    • Inappropriate computer room floor plan and equipment layouts
    • Pre-cooling of returning hot air by incorrect perforated floor tile placement and unsealed cable openings
    • Control sensors and instruments not calibrated
    • Engineering consultants who do not yet understand the unique cooling dynamics of data centers and underfloor air distribution
  • CONSEQUENCES OF THERMAL INCAPACITY The following results are based on detailed measurements in 19 computer rooms totaling 204,400 ft²
    • 10% of the racks had “hot spots” with intake air exceeding 77°F / 40% RH
    • This occurred despite having 2.6 times more cooling running than was required by the heat load
    • Rooms with the greatest excess of cooling capacity had the worst % of hot spots
    • 10% of the cooling units had failed
  • Cooling concerns
    • No computer room master plan
    • Failure to measure, monitor, and use installation best practices
    • Mechanical incapacity
    • > Bypass airflow
    • Need for supplemental cooling
  • BYPASS AIRFLOW DEFINITION
    Conditioned air is not getting to the air intakes of computer equipment:
    • Escaping through cable cutouts and holes under cabinets
    • Escaping through misplaced perforated tiles
    • Escaping through holes in computer room perimeter walls, ceiling, or floor
  • COMPUTER ROOM LAYOUT OPTIONS: EFFECT OF BYPASS AIRFLOW
    • Cold air escapes through cable cutouts
    • Escaping cold air reduces static pressure, resulting in insufficient cold aisle airflow
    • The result is vertical and zone hot spots in high heat load areas
  • RAISED-FLOOR UTILIZATION TRADITIONAL LAYOUT
    • All aisles have elevated “mixed” temperature (starved supply airflow compounds problem)
    • Fails to deliver predictable air intake temperatures
    • Reduces return air temperature which reduces cooling unit capacity
  • TYPICAL BYPASS AIRFLOW CONDITION Reduces kW capacity per rack that can be effectively and predictably cooled.
  • TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) This unnecessarily large raised-floor opening should be closed. The edges of the cutout must be dressed according to NFPA code.
  • TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Unnecessarily large cable cutout under a server rack
  • TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Both a bypass airflow problem and a safety hazard.
  • BYPASS AIRFLOW - IS IT A PROBLEM?
    • Based on detailed measurements in 19 computer rooms totaling 204,400 ft²
    • Despite 2.6 times more cooling running than was required by the heat load, 10% of racks had air intake temperatures exceeding ASHRAE maximum reliability guidelines (rooms with greatest excess cooling capacity running had worst hot spots)
    60% of available cold air is short cycling back to cooling units through perforated tiles in the hot aisle and unsealed cable openings
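    A simplified way to connect these two findings (my arithmetic, not a calculation from the deck): with 60% of the cold air short-cycling, even 2.6x running capacity leaves almost no delivered margin.

```python
# If 60% of conditioned air short-cycles back to the cooling units, only 40%
# reaches IT intakes, so 2.6x running capacity delivers barely more cold air
# than the heat load requires.
running_capacity_ratio = 2.6   # running cooling vs. heat load (from the slide)
bypass_fraction = 0.60         # cold air short-cycling (from the slide)

delivered_ratio = running_capacity_ratio * (1 - bypass_fraction)
print(f"Cold air actually reaching equipment: ~{delivered_ratio:.2f}x the load")
# ~1.04x: effectively no margin, which is consistent with the persistent hot spots.
```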
  • BYPASS AIRFLOW REDUCES RELIABILITY, STABILITY, AND USABLE COOLING CAPACITY
    • Reduces underfloor static pressure
    • Reduces volume of conditioned air coming into the cold aisle
    • Exacerbates problems with underfloor obstructions
    • Creates environment where recirculation of hot exhaust air across the top of racks will occur
    Reduces kW capacity per rack that can be effectively and predictably cooled
  • INTERNAL RECIRCULATION CAN REDUCE RELIABILITY
    • Utilize blanking plates within cabinets
    Internal recirculation is also a problem
  • BYPASS AIRFLOW HOW IS IT FIXED? PERIMETER HOLES
    • Use permanently installed firestop materials for conduits, pipes, construction holes, etc., through walls
    • Removable fire pillows for floor or wall cable pass-throughs
    Seal all the holes in the computer room perimeter
  • BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED Good example of fire stopping through a sidewall.
  • BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED (CONTINUED) Excellent fire stopping practices are evident throughout this site.
  • FLOOR OPENINGS ACCEPTABLE CABLE CUTOUT SOLUTIONS
    • Fire pillows
    • Foam sheeting
    • Brush assemblies
    Seal all raised-floor cable openings plus openings around PDUs and cooling units
  • FLOOR OPENINGS FIRE PILLOW SOLUTION
    • Difficult to achieve an effective level of sealing
    • Often falls to the subfloor or is kicked out of the way
    • Regular policing is required
    • No static-dissipative property for electrostatic charge
  • FLOOR OPENINGS FIRE PILLOW EXAMPLES This is one way to prevent air loss. Additional refinement is needed.
  • FLOOR OPENINGS FOAM SHEETING SOLUTION
    • Very labor intensive to achieve good sealing efficiency
    • Every cabling change requires re-cutting foam
    • Often tears, pulls out, or falls to subfloor when cable head is pulled through
    • Requires regular policing
    • Special foam material is required to achieve static dissipation
  • FLOOR OPENINGS FOAM SEALING EXAMPLES Plugging the cable opening is a good practice, but a better choice of materials would be more appropriate.
  • FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED)
    • The foam in this picture was torn and hanging by a thread.
    • It was pieced back together for the picture.
    • Tearing occurs when the cable head passes through.
    • Foam is typically deformed or missing in 50% to 75% of openings after six months.
  • FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED) Foam sealing has not been reinstalled after re-cabling. Resulting opening allows significant air leakage.
  • FLOOR OPENINGS BRUSH SEALING ASSEMBLIES
    • Most expensive initially, but lowest life-cycle cost because recurring policing labor is not required
    • High sealing effectiveness both initially and after multiple recablings (100% sealing effectiveness in undisturbed opening area)
    • Doesn’t require training or policing
    • Can be static dissipative
  • FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS
    • Brush grommet for sealing new holes in floor tiles.
  • FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS
    • Brush grommet for sealing new holes in floor tiles.
  • FLOOR OPENINGS BRUSH SEALING FOR EXISTING OPENINGS
    • Separable brush grommet for sealing existing openings in floor tiles.
  • INTERNAL BYPASS AIRFLOW - HOW IS IT FIXED? Install blanking plates within cabinets to prevent open RU positions from recirculating hot exhaust air
  • INTERNAL BYPASS AIRFLOW BLANKING PLATE INSTALLATION EXAMPLE Proper use of blanking or filler plates exhibited.
  • Cooling concerns
    • No computer room master plan
    • Failure to measure, monitor, and use installation best practices
    • Mechanical incapacity
    • Bypass airflow
    • > Need for supplemental cooling
  • NEED FOR SUPPLEMENTAL COOLING When the normal cooling system will not handle the load, especially in high density spot situations, supplemental cooling is necessary
    • Dedicated to one cabinet
    • Dedicated to an area of the room
  • SUPPLEMENTAL COOLING OPTIONS
    • In line cooling – horizontal airflow
      • Hot exhaust air drawn through a fan coil unit and cold air blown into the Cold Aisle. Usually a chilled water installation
    • Overhead cooling
      • Fan coil unit sits on top of cabinet or is hung from ceiling.
      • Hot exhaust air is drawn through a fan coil unit, using refrigerant rather than chilled water, and blown into the Cold Aisle.
    • Back cover cooling
      • Fan coil system replaces the back cover of a cabinet
      • Usually a chilled water system
      • Heat is neutralized before being blown into Hot Aisle
    • Dedicated cabinet
      • Air is recirculated within the cabinet
      • Contains its own fans and cooling coil
      • All have redundant fan systems
      • Only one has redundant cooling capability
  • MORE INFORMATION
    • koldlok.com – Koldlok Products
    • upsite.com – White Papers
    • How to Successfully Cool High-Density IT Hardware Seminar
      • November 1 – 3, Miami, FL
      • November 27 – 29, Santa Fe, NM
      • Check upsite.com for details
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • HISTORY OF INNOVATION
    • 1985: MF tape racks
    • 1989: Autotrieve, the first automated tape racks; first S.A.M.
    • 1991: First high-density tape racks
    • 1992: NER’s first custom server cabinets
    • 1998: Began distributing Cybex
    • 2000: Began distributing NetBotz
    • 2001: Began distributing ServerTech
    • 2002: Introduced R3
    • 2004: Ultimate Core / largest Avocent strategic distribution partner
    • 2005: Launch of services business
    • 2006: Infrastructure consulting, design, build, high-density power and cooling
    • 2006: IT security audit, compliance services
  • INNOVATION ROADMAP: Solutions and services for 2006 and beyond
    • 2005: Begin factory integration
    • 2005: On-site integration
    • 2006: Data center health check; CFD modeling and Adaptivcool cooling solutions; build-out/build-new assessment; enhanced centralized management solutions and training services; project management and implementation
    • 2006: Asset & inventory service; data center construction; enhanced facility monitoring
  • THE LATEST FROM AFCOM SURVEYS
    • “More than two thirds (66%) of 178 AFCOM data center professionals surveyed anticipate they'll have to expand their data centers or turn to outsourcing to meet demands in the next decade.”
    • Source: Information Week, August 2006
  • AFCOM SURVEY SAYS
    • “By 2010, more than half (50%) of all data centers will have to relocate or outsource some applications.”
  • CURRENT STATE OF DATA CENTER COOLING
    • 19 Data Centers surveyed
    • The average site had 2.6x more cooling running than the IT load required
    • Even so, 10% of racks were over 77°F at the intake
    • 72% of cooling air bypasses the racks or mixes before reaching them
    AFCOM 2006 Keynote Speech
  • ARE YOU PREPARED??
    • “AFCOM’s Data Center Institute, a think tank dedicated to improving data center management and operations, predicts that over the next five years, power failures and limits on power availability will halt data center operations at more than 90% of companies.”
    AFCOM 2006 Keynote Speech
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • CIO’S PARADOX TODAY (SAME MESSAGE FOR 3 YEARS) The CIO’s Paradox in action: what the CIO hears from other business leaders:
    • “You’re a service. Why can’t you respond to our division better? What are your people doing?” & “IT is strategic. How do you, the CIO, set investment priorities?”
    • “I have no trust in IT’s ability to deliver measurable value” & “We need new, better solutions.”
    • “Reduce your budget!” & “Keep our systems running 24x7!”
  • DATA CENTER FAULT TOLERANCE Are these photos from your data center?
  • THERMAL MANAGEMENT- IT’S A REALITY (WE HAVE BEEN USING THIS SLIDE FOR YEARS)
    • New Servers
      • One cabinet = 27,304 BTU/hr!!!!
        • One ton of cooling = 12,000 BTU/hr
        • A fully configured cabinet can produce 35,000+ watts.*
          • *IBM BladeCenter H with four 2,900 W power supplies per chassis, three chassis per cabinet
    • Projected thermal loads, servers and DASD [chart: roughly 100 W/ft² in 2000, 150 W/ft² in 2002, and 200 W/ft² in 2005]
    Power = Heat!
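    The unit conversions behind these figures are standard; the worked numbers below are my illustration, not part of the deck.

```python
# Standard conversions: 1 W ~= 3.412 BTU/hr, 1 ton of cooling = 12,000 BTU/hr.
BTU_PER_HR_PER_WATT = 3.412
BTU_PER_HR_PER_TON = 12_000

def watts_to_tons(watts: float) -> float:
    return watts * BTU_PER_HR_PER_WATT / BTU_PER_HR_PER_TON

print(f"27,304 BTU/hr cabinet   = {27_304 / BTU_PER_HR_PER_TON:.1f} tons of cooling")  # ~2.3 tons
print(f"35,000 W loaded cabinet = {watts_to_tons(35_000):.1f} tons of cooling")        # ~10 tons
```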
  • CAN YOU GET 5 TONS OF AIR THROUGH ONE PERFORATED TILE?? HOW ABOUT 9?
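    Rough airflow arithmetic behind this question (my illustration, using the standard sensible-heat relation Q ≈ 1.08 x CFM x ΔT°F):

```python
# Airflow needed to carry a given cooling load at a 20 F supply-to-return rise.
def cfm_required(tons: float, delta_t_f: float = 20.0) -> float:
    btu_per_hr = tons * 12_000
    return btu_per_hr / (1.08 * delta_t_f)   # sensible heat: Q = 1.08 * CFM * dT

for tons in (5, 9):
    print(f"{tons} tons at a 20 F rise needs ~{cfm_required(tons):,.0f} CFM")
# ~2,800 CFM and ~5,000 CFM respectively, versus the few hundred CFM a typical
# perforated tile actually delivers, which is the point of the question.
```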
  • HOW WILL YOU GET 6 kW OR MORE TO A SINGLE CABINET?
    • Application based
    • Circuit options for 5.9 kW (capacity check sketched below):
      • (4) 20 amp, 110 V circuits; (8) if redundant
      • (3) 30 amp, 110 V circuits; (6) if redundant
      • (2) 30 amp, 208 V circuits; (4) if redundant
      • (1) 60 amp, 208 V circuit; (2) if redundant
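    The capacity check below is my arithmetic, assuming the common 80% continuous-load derating, which the deck does not state.

```python
# Usable capacity of each circuit option at an assumed 80% continuous-load derating.
DERATE = 0.80
options = [
    ("4 x 20 A, 110 V", 4, 20, 110),
    ("3 x 30 A, 110 V", 3, 30, 110),
    ("2 x 30 A, 208 V", 2, 30, 208),
    ("1 x 60 A, 208 V", 1, 60, 208),
]
for label, circuits, amps, volts in options:
    usable_kw = circuits * amps * volts * DERATE / 1000
    print(f"{label}: ~{usable_kw:.1f} kW usable")
# All four options clear the 5.9 kW target with headroom; doubling the circuit
# counts gives the redundant configurations listed above.
```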
  • HIGH DENSITY Leads to Complexity (we have been talking about this for years)
    • Manufacturers have their own management platform
    • Higher powered systems require a higher level of management
    • Physical requirements change
    • Supporting Infrastructure is more complex (weight, power, heat, cables)
  • WHAT CAN YOU DO?
    • Implement best practices where you can
    • Focus on high value/high return
    • Choose solutions that allow for standardization & automation
    • Rely on Trusted Advisors to fill the gaps
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • BACKGROUND – SOUND FAMILIAR?
    • Aging data center (7 years)
    • Blades and other high-density equipment implemented, creating power and heat problems
    • Consolidating via VMware
    • Management demanding higher reliability
    • Management will not move the data center
  • DATA CENTER HEALTH CHECK-UP
    • Thermal – Airflow delivery, perforated tile placement
    • Cable Management – In-rack cabling, overhead ladder racking, sub-floor cable raceway
    • Standardization - Cabinet placement & orientation, data center layout, cabinet types
    • Remote Access & Monitoring – KVM types, environmental monitoring, server room access
    • Fault Tolerance – Redundant power, CRAC failure
    What areas did we investigate?
  • ISSUE #1- THERMALS/HEAT
    • Problem:
      • Heat build-up in server room
    • Leading Causes:
      • Sub-floor obstructions – power & data cabling
      • 12” raised floor is insufficient for the size of the server room
      • Perforated tiles incorrectly placed
      • Cabinet placement and orientation do not permit hot aisle return paths – cabinets take in hot air
  • HEAT BUILDUP IN SERVER ROOM
  • THERMAL MANAGEMENT RECOMMENDATIONS
    • Clean up under-floor power and data cabling
    • Move toward proper cable management: “Data Above, Power Below”
    • Re-work perforated floor tile positions
    • Block cable cutouts with brush grommets
    • Implement hot-aisle/cold-aisle design
    • Install Adaptive Cool airflow delivery system to maintain temperatures in and around high-density cabinets
  • RECOMMENDED PERFORATED TILE POSITIONING WITH EXISTING LAYOUT Move this row!
  • RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT Note the placement of the CRACs relative to the hot aisle. Note that we shut off one of the CRACs!
  • THERMAL PATTERNS OF RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT (before and after)
  • CABLE MANAGEMENT ISSUE #2
    • Problem:
      • Power & Data Cabling causing clutter under raised floor and inside cabinets
    • Leading Causes:
      • No standard for in-rack cable management, causing waterfall cabling and heat buildup
      • No overhead ladder racking
      • No cable raceway under the floor
      • Server depth reaching cabinet capacity
  • CABLE MANAGEMENT RECOMMENDATIONS
    • Segregate power and data via cable management
    • Eliminate swing arms
    • Install patch panels to document rack density and per-port usage
    • Use ladder racking
    • Standardize placement of in-rack power supplies to consolidate and secure power cabling
    (Before and after photos)
  • STANDARDIZATION ISSUE #3
    • Problem:
      • Lack of standardization across server room
    • Leading Causes:
      • Cabinets vary in size, shape, and vendor type
      • Row lengths vary, gaps in rows due to tables
      • Rack space unused
      • No cable management standards
      • In-rack power utilizing inefficient voltage/amperage
  • STANDARDIZATION RECOMMENDATIONS
    • Set standards for cabinets
      • Invest in a standard cabinet structure, power, cable mgmt
        • High voltage/high amperage power will increase efficiency/decrease usage
      • Separate power and data
      • Consolidate servers into high density cabinets
    • Arrange like cabinets in rows
      • Remove tables from rows
      • Arrange rows in equal lengths to improve airflow patterns
      • Arrange layout into hot-aisle/cold-aisle design
  • REMOTE ACCESS & MONITORING ISSUE #4
    • Problem
      • Constant foot traffic in server room due to lack of remote access
      • Current KVM technology uses legacy analog local-access methods
    • Leading Causes:
      • Current remote access methods costly and using up multiple network ports
      • Environmental monitoring not adequate or providing remote notification
      • Central alarm management located off-site
      • Leak detection system inadequate for lakeside location
  • REMOTE ACCESS & MONITORING RECOMMENDATIONS
    • Invest in modern KVM technologies
      • Reduce cabling and port usage
      • Improve access to servers from NOC and remote sites
    • Install and maintain centralized remote access & monitoring solution in-house
      • Reduce foot traffic in and out of the server room
      • Improve security of devices and data
    • Install improved environmental monitoring system
      • Configure to monitor CRAC, UPS, Generator
      • Configure for remote notification services according to user-specified thresholds
      • Install adequate leak detection solution
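    As a hypothetical illustration of the user-specified thresholds mentioned above (metric names and limits are mine, not the customer's):

```python
# Hypothetical alert thresholds for the environmental monitoring recommendation.
# Each metric alarms when the reading exceeds the limit.
ALERT_THRESHOLDS = {
    "rack_intake_temp_f": (75, 80),   # (warning, critical)
    "crac_return_temp_f": (80, 90),
    "ups_load_pct":       (75, 90),
    "humidity_pct":       (60, 65),
}

def alert_level(metric: str, reading: float) -> str:
    warn, crit = ALERT_THRESHOLDS[metric]
    if reading >= crit:
        return "critical"   # trigger remote notification / page on-call staff
    if reading >= warn:
        return "warning"    # e-mail notification
    return "ok"

print(alert_level("rack_intake_temp_f", 82))  # -> "critical"
```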
  • FAULT TOLERANCE ISSUE #5
    • Problem
    • Lack of Redundant systems to prevent unscheduled downtime
      • Cabinets utilizing non-redundant power sources
      • Cable clutter putting devices at risk of being disconnected
      • Three-phase power delivery system out of balance
      • CRAC placement inhibits redundancy and recovery from failures (page 16)
      • UPS and Generator reaching maximum capacity, no room for growth
      • Risk of lakeside flooding requires a more capable leak detection system and disaster notification services
  • WORST CASE SCENARIO – 15-TON CRAC FAILURE (compares a failure in the current layout with a failure in the hot/cold layout)
  • FAULT TOLERANCE RECOMMENDATIONS
    • Plan in-rack power around fully configured cabinet
      • Proper planning will result in a balanced three-phase power delivery system
      • Proper cable management will decrease risk of disconnection of power/data cabling
    • Consistent maintenance of aging CRAC units will extend lifetime
      • Improved airflow will help to improve efficiency of units
      • Cooling redundancy during CRAC failures still not available
    • Install an improved leak detection system
      • Monitor perimeter of data center as well as specified points such as chilled water piping, overhead plumbing, and CRAC units
  • COST BENEFIT ANALYSIS
    • Category 1: Low benefit / low return; low cost / minimal effort
    • Category 2: High benefit / high return; low cost / minimal effort
    • Category 3: Minimal benefit / low return; high cost / maximum effort
    • Category 4: High benefit / high return; high cost / maximum effort
    (Matrix axes: Benefit / Return on Investment vs. Cost / Effort)
  • AIRFLOW/CABLE MGMT COST/BENEFIT ANALYSIS MATRIX
    • Category 1: Manage cabling inside racks; position perforated tiles in recommended layout; eliminate vendor cable mgmt
    • Category 2: Consolidate/rerun cabling; invest in proper perforated tiles and layout; implement hot/cold layout
    • Category 3: Add/replace CRAC units; add/replace UPS/generator; standardize racks/power
    • Category 4: Standardize racks/power in conjunction with hot/cold layout; install dynamic airflow system
    (Matrix axes: Benefit / Return on Investment vs. Cost / Effort)
  • REMOTE A&M/FAULT TOLERANCE COST/BENEFIT ANALYSIS MATRIX
    • Category 1: Manage cabling to eliminate risk of disconnections; improve remote access via current methods
    • Category 2: Install modern KVM/remote access & monitoring; maintain/upgrade CRACs; improve leak detection
    • Category 3: Add/replace CRAC/generator; standardize racks/power/data/access; improve leak detection
    • Category 4: Relocate data center; create cooling/disaster recovery plan from scratch; plan power/data/access
    (Matrix axes: Benefit / Return on Investment vs. Cost / Effort)
  • IN SUMMARY
    • Recommendations will help to mitigate the issues
      • Resolutions may not provide room for growth/fault tolerance without the large cost associated with adding or replacing data center devices (CRAC, UPS, generator)
    • Assistance in implementation of Hot-aisle/Cold-aisle setup and rack/power standardization will help to improve issues
      • May cause downtime
      • Requires effort towards rearrangement
    • High cost/minimal effort resolutions may aid in resolving current issues
      • Issues may arise again due to normal growth during next several years
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • BACKGROUND
    • High Visibility name brand with a strong competitor
    • New Data Center Build (renovation)
    • Very old multi-use building near city center
    • Short time frame from build-out completion to lights-on
    • Well known consulting firm doing project management
  • ISSUES
    • 30 days to implement new data center (post construction)
    • Maintain enterprise standards while meeting the varying requirements of the new data center build
    • Provide best practices at the rack level
    • Maximize efficiencies of standardization and automation
      • Cable management: cabinet placement & orientation
      • Remote access & monitoring: KVM types
      • Fault tolerance: redundant power
  • VARYING POWER REQUIREMENTS BY ROW WERE A PERCEIVED ISSUE FOR THE CUSTOMER (rows rated at 15 kW, 12 kW, 9 kW, and 5 kW)
  • TIME WAS NOT ON THEIR SIDE
    • Build-out had a 90-day window for completion
    • Vendor management was an issue for the project manager
  • THE SOLUTION (FACTORY INTEGRATED CABINETS)
    • 2 to 4 208 V, 60 A power strips
    • Pre-wired with two C13 power cords every 4U (total of 14 cords)
    • Run two blue Cat5 copper cables and one green Cat5 copper cable from each server location
    • Two pairs of LC cable to the top of the rack
    • Install a 24-port Cat6 panel and three fiber cassettes for a total of 18 LC ports
    • Keyboard pull-out
    • IP remote console for each cabinet
    • Ganged and leveled at the customer site
  • THE RESULT?
    • Customer saves 8 hours of integration time per cabinet: $105,000 in hard cost savings after integration fees
    • $50,000 savings from the electrical contractor because of fewer runs, all of them identical
    • Maintained standards as every cabinet was the same (look and feel)
    • Funneled many tasks (and vendors) into a single source. One throat to choke.
    • Met 30 day operational deadline
  • AGENDA
    • Problem areas - power
    • Cooling: Major areas that must be addressed when installing high density equipment
    • Growth of power utilization and heat density: impacts, expectations, and reality
    • Cooling concerns
    • History and future of the data center
    • Data center reliability and availability
    • Case study #1: Data center health check
    • Case study #2: Integrated cabinet solution
    • Case study #3: Disaster recovery site
  • BACKGROUND
    • The Katrina disaster 2,000 miles away, a CRAC failure during an Indian summer, the temporary unavailability of rental spot coolers, and persistent hot spots spotlighted the need for better thermal management.
    • The site faced sharply rising energy costs it had not budgeted for. Higher electricity bills cut deeply into the IT department’s budget. The Technology Manager knew he needed a solution to both the thermal and cost problems and evaluated the options available.
  • ISSUES (OPTIONS CONSIDERED)
    • Use hot aisle/cold aisle layout
      • This is a normal solution, but here the location of all the CRACs on one side of the room and the inability to shut down the disaster recovery function’s IT equipment during a move made this option unattractive.
    • Add additional CRAC capacity
      • Room was almost full
      • No capital budget for buying more CRACs
      • More CRACs would drive utility bills higher
      • The site already had more than enough cooling capacity with its 30-ton units
  • OBJECTIVES OF THE PROJECT
    • Maintain a higher cooling margin during the summer months, when the environment immediately surrounding the data center exposes it to temperature extremes
    • Fix hot spots in several areas
    • Save money on electric utilities, with payback on any expense in less than a year
    • Allow remote monitoring of the data center’s thermal health
    • Plan for almost certain expansion of the center’s thermal load
    • Establish a support and maintenance plan that would ensure continued thermal performance
  • IMPLEMENTATION
    • Site Audit to inventory and characterize the IT equipment heat sources and the facility;
    • Simulation using CFD (Computational Fluid Dynamics) modeling to predict heat and airflow of baseline Data Center;
    • Verification of the CFD model against measurements taken during the audit;
    • Iteration using the model to determine the optimum configuration of passive and active airflow elements.
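    As a minimal illustration of the verification step described above (the sensor names and readings below are made up, not audit data from the case study):

```python
# Compare CFD-predicted rack intake temperatures against audit measurements and
# report the largest deviation; values are invented for illustration.
measured_f  = {"rack_A1": 72.5, "rack_B3": 78.0, "rack_C2": 81.5}   # audit readings
predicted_f = {"rack_A1": 71.0, "rack_B3": 79.5, "rack_C2": 83.0}   # baseline CFD model

errors = {rack: predicted_f[rack] - measured_f[rack] for rack in measured_f}
worst = max(errors, key=lambda r: abs(errors[r]))
print(f"Largest model error: {errors[worst]:+.1f} F at {worst}")
# If the model tracks measurements within an agreed tolerance (say +/-2 F), it
# can be trusted for the iterative what-if runs described above.
```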
  • CFD MODELING (before and after)
  • IMPLEMENTATION
    • Installation of a sensor network to monitor changes during the remaining installation
    • Install and reconfigure passive and active airflow elements;
    • Verify and recertify room thermal performance;
      • No downtime was incurred during any of the project phases.
  • COOLING MARGIN RESULTS The cooling margin of the room was improved by 7°F at the top of the racks.
  • SUMMARY OF RESULTS More dramatically, virtually all server intake temperatures dropped, some by as much as 14°F.
  • SUMMARY OF UTILITY SAVINGS
    • This particular site cannot take full advantage of its potential energy savings today because it has constant-speed components.
    • Facilities executives are considering adding low-cost variable frequency drives (VFDs) in the future to benefit fully.
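    The reason VFDs matter so much is the fan affinity laws; the cube-law figures below are my illustration, not savings claimed in the case study.

```python
# Fan power scales roughly with the cube of fan speed, so modest airflow
# reductions yield large power savings once drives are variable-speed.
def relative_fan_power(speed_fraction: float) -> float:
    return speed_fraction ** 3

for speed in (1.0, 0.9, 0.8):
    print(f"{speed:.0%} speed -> ~{relative_fan_power(speed):.0%} of full fan power")
# 90% speed -> ~73%, 80% speed -> ~51%. Constant-speed fans forgo these savings.
```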
  • SUMMARY
    • Data Center cooling/power technology has remained virtually unchanged for 30 years
    • IT equipment refreshes every 3 years.
    • Power requirements and heat densities are increasing with each new generation.
    • Adding more raw cooling capacity or more low-voltage whips is rarely the right answer.
  • FINAL WORD Meeting current and near-term power needs with high-voltage/high-amperage power, targeting the available cooling to where it is needed most, and controlling airflow precisely are the sensible approaches that will pay dividends in equipment uptime, energy costs, and real estate.