Immersion optimized IT platforms
Implementing OCP Design Guidelines for Immersion-Cooled IT Equipment
Rolf Brink, CEO, Asperitas
(Rolf.Brink@asperitas.com)
Server/ACS
Asperitas Immersion Technology
Asperitas AIC24-15/21”
Passive immersion technology
Shell Immersion Cooling Fluid S5 X (Hydrocarbon)
Power density:
• Compute density: 45 kW/m² at 32°C
High availability immersion solution
• Dual power and cooling integration
• Self-contained and self-sustained (gravity driven)
• Full monitoring and autonomous safety (monitoring, alarming, control)
Thermally optimized for platform focus
• Immersion optimized IT platforms
• Serviceable IT equipment
• Fully warranted solutions by OEMs
ADVANCED
COOLING
SOLUTIONS
SERVER
Asperitas certification
• Process facilitates and executes OCP “Design Guidelines for Immersion-Cooled IT Equipment”
• Standardized on Asperitas OCP Open Cassette design
• OEM-focused collaborations, facilitating fully warranted solutions
Requirement: Density, Performance, Thermal
System specifications, Thermal design, System design
Level 1: Feasibility
System build, Material analysis, Thermal performance
Level 2: Prototype
Validated system design, Thermal certification, Duration test
Level 3: Supported OEM/vendor platform
Platform optimisation, OEM support alignment, OEM redesign
OEM product lifecycle
OCP Asperitas Open Cassette SPEC (AOC) Virtual Summit 2020
https://www.opencompute.org/documents/20200227-open-cassettes-specification-v1-0-pub-pdf
Asperitas Open Cassette SPEC
Engineering optimized platforms
Edge platform
• High cooling temperature tolerance (48°C)
• Medium CPU density (4x AMD EPYC/1U)
• Minimized footprint (15” chassis)
Enterprise mainstream platform
• High availability implementation
• High overall efficiency
• Highly serviceable
HPC platform
• High performance implementation
• CPU & GPU dense
• High overall efficiency
                   Edge                   Enterprise mainstream   HPC
Target             Data processing        High availability       AI/Machine learning
Dimensions         1U/15”                 2U/21”                  1U/21”
Platform           SMC/AMD                Dell/AMD/Intel          Penguin/Gigabyte/Intel/NVIDIA
Original platform  2124BT-HTR (BigTwin)   Dell C6525/C6420        Relion XO1114GTS
Edge thermal and fluid optimization
1U 15” Open Cassette implementation
• High liquid flow abilities
• Limited space for components
• One board “upside down” (no upside down)
• Small form factor PSU
Main thermal sources
• CPUs (180 W)
• 80 PLUS PSU (10% loss margin calculated)
Design decisions
• Fixed board mounting instead of sleds
• Reliability: PSU thermal tolerance 60°C, positioned at the bottom
• Performance: CPUs as low as possible above the PSU
• Accelerate thermal shadowing to minimize impact
• Custom-designed power delivery infrastructure by Supermicro
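The main thermal sources listed for the edge platform can be turned into a rough heat-load estimate. The following is a minimal Python sketch, not the calculation used for the actual design. It assumes four CPUs per 1U cassette (from the edge platform density), 180 W per CPU, and that the 10% PSU loss margin is applied to the CPU load only; memory, storage and network adapters are ignored, and the function name is hypothetical.

```python
def cassette_heat_load_w(cpu_count, cpu_power_w, psu_loss_fraction):
    """Estimate heat released into the fluid by the CPUs plus PSU conversion loss.

    Assumption: the PSU loss margin is taken as a fraction of the CPU load only.
    """
    cpu_heat_w = cpu_count * cpu_power_w
    psu_loss_w = cpu_heat_w * psu_loss_fraction
    return cpu_heat_w, psu_loss_w, cpu_heat_w + psu_loss_w


# Figures from the slides: 4x AMD EPYC per 1U, 180 W per CPU, 10% PSU loss margin.
cpu_heat, psu_loss, total = cassette_heat_load_w(4, 180.0, 0.10)
print(f"CPU heat: {cpu_heat:.0f} W, PSU loss: {psu_loss:.0f} W, total: {total:.0f} W")
```

A real budget would add the remaining board components; the sketch only shows how the two listed sources combine.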
Enterprise system optimization
2U 21” OCP Open Cassette adaptation
• High liquid flow abilities
• Flexible space for components
• Adapted built-in blade slots
Main component sources
• Dell C6525 and C6420 blades
• Dell original PSUs
Design decisions
• Single side servicing: remove drive bays (on-board storage)
• Extended blade designs for lowered positioning
• Original backplane designs re-positioned
• Off-the-shelf blade support
• Custom firmware/system management by Dell
HPC design optimization
1U 21” Open Cassette adaptation
• High liquid flow abilities
• Flexible space for components
Performance optimization
• Focus on GPU performance: lowest position in chassis
• CPUs as low as possible in remaining space
Density optimizations
• Power shelf to chassis PSU integration (17% of space in tank)
• 1OU converted to 1U to increase density (8% of space in tank)
Design decisions
• Custom PDB by Gigabyte
• Custom cabling by Gigabyte
Material compatibility study
Mainboard and PSU focused
• Only compatible cabling used
• Capacitors (potential rubber sealing)
• Thermal compounds (potential to dissolve)
• Labels (potential to dissolve glue or ink)
• Fan simulator application (removal of fan)
Test methods
• Material sheets not always available
• Visual (high-resolution photo) analysis before and after immersion
• Dielectric thermal bath testing, liquid analysis after test (Shell)
• 100x rapid power switching (PSU relay)
• Visual check confirmation before and after immersion
Duration testing
• Continuous logging of thermal properties
• Continuous logging of system performance
Liquid analysis
Compatibility test experiments:
• Immersing plastics, elastomers, metals and parts under controlled temperatures (room temperature, 50, 80 and 100°C)
• For some weeks or months (accelerated test)
• Measuring component weight and volume behaviour
[Figure panels: ICP; Dielectric Breakdown Voltage]
Integrated fluid analytics:
• ICP: identify metals and elements originally not in the fluid
• FTIR: material spectra comparison / impurities
• Dielectric breakdown: thermal fluid performance
Source: Shell Technology Centres SST/STCHa
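The weight and volume measurements in the compatibility tests are typically reported as a percentage change between the pre-immersion and post-immersion state. The following is a minimal Python sketch of that arithmetic. The sample records are hypothetical illustration values, not Shell or Asperitas results, and the function and field names are assumptions.

```python
def percent_change(before, after):
    """Return the relative change from `before` to `after` in percent."""
    if before == 0:
        raise ValueError("the pre-immersion value must be non-zero")
    return 100.0 * (after - before) / before


# Hypothetical measurements (before, after) for one material at the four
# bath temperatures named on the slide; room temperature is taken as 20 degC.
samples = [
    {"temp_c": 20, "mass_g": (2.000, 2.004), "volume_cm3": (1.500, 1.503)},
    {"temp_c": 50, "mass_g": (2.000, 2.030), "volume_cm3": (1.500, 1.545)},
    {"temp_c": 80, "mass_g": (2.000, 2.070), "volume_cm3": (1.500, 1.590)},
    {"temp_c": 100, "mass_g": (2.000, 2.110), "volume_cm3": (1.500, 1.650)},
]

for sample in samples:
    mass_pct = percent_change(*sample["mass_g"])
    volume_pct = percent_change(*sample["volume_cm3"])
    print(f"{sample['temp_c']:>3} degC: mass {mass_pct:+.2f}%, volume {volume_pct:+.2f}%")
```

Swelling shows up as a positive volume change, while extraction of plasticizers into the fluid would show up as a negative mass change.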
Thermal optimization
Thermal shadowing
• Focus on GPU workloads, all GPUs parallel
• PSU as critical component, placement parallel to GPUs
• CPUs secondary workload focus, in GPU shadow
Other critical components
• High utilization of SSDs
  • Placement in intermediate temperatures
  • GPU shadow acceptable
• InfiniBand network adapters
  • High temperature tolerant
  • Placement in high temperature layers
  • Ports extended for service access
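Thermal shadowing can be illustrated with a simple energy balance: fluid rising past a heat source of power P warms by P / (mass flow × specific heat), so every higher layer receives warmer fluid. The following is a minimal Python sketch under strong simplifications: a single vertical flow path with a fixed mass flow. The 32°C inlet temperature is borrowed from the power density figure. The layer powers, mass flow, specific heat and the order of CPUs versus SSDs are hypothetical illustration values, not measured data.

```python
def layer_inlet_temps(inlet_c, layer_powers_w, mass_flow_kg_s, cp_j_per_kg_k):
    """Return the fluid temperature entering each layer, listed bottom to top."""
    temps = []
    fluid_c = inlet_c
    for power_w in layer_powers_w:
        temps.append(fluid_c)
        # Energy balance: the fluid leaves the layer warmer by P / (m_dot * c_p).
        fluid_c += power_w / (mass_flow_kg_s * cp_j_per_kg_k)
    return temps


# Hypothetical layout, bottom to top, following the placement on the slide.
layers = [
    ("GPUs and PSU (parallel)", 1200.0),
    ("CPUs (in GPU shadow)", 450.0),
    ("SSDs (intermediate layer)", 50.0),
    ("InfiniBand adapters (top layer)", 30.0),
]
temps = layer_inlet_temps(
    inlet_c=32.0,
    layer_powers_w=[power for _, power in layers],
    mass_flow_kg_s=0.1,     # assumed value
    cp_j_per_kg_k=2000.0,   # assumed value
)
for (name, _), temp_c in zip(layers, temps):
    print(f"{name}: fluid arrives at {temp_c:.2f} degC")
```

The ordering matters because the components with the highest performance priority sit where the fluid is coolest, and the temperature-tolerant ones sit where it is warmest.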
Thermal performance (max populated)
Performance testing at scale
Description                 AVG 55/CPU1  FREQ 55/CPU1  AVG 55/CPU2  FREQ 55/CPU2  AVG 57/CPU1  FREQ 57/CPU1  AVG 57/CPU2  FREQ 57/CPU2
Test19 throttling phase 1   84.2°        2556 MHz      83.5°        2792 MHz      86.5°        2529 MHz      88.0°        2603 MHz
Test19 throttling phase 2   87.6°        2546 MHz      87.5°        2744 MHz      88.2°        2436 MHz      89.9°        2500 MHz
Test19 throttling phase 3   88.8°        2550 MHz      89.2°        2717 MHz      88.7°        2373 MHz      90.0°        2414 MHz
Test19 throttling phase 4   89.8°        2481 MHz      89.8°        2646 MHz      89.1°        2277 MHz      90.0°        2324 MHz
Test19 throttling phase 5   90.0°        2395 MHz      89.8°        2576 MHz      89.5°        2206 MHz      90.0°        2255 MHz
Test19 throttling phase 6   90.0°        2268 MHz      89.9°        2541 MHz      90.0°        2083 MHz      90.0°        2121 MHz
Test19 throttling phase 7   90.0°        2161 MHz      89.9°        2447 MHz      90.0°        1872 MHz      89.6°        2625 MHz
CPU data: AMD EPYC™ 7742
Max Boost 3400 MHz
Boost TDP 225 W
Tcase Max 81°C
Tctl Throttle 94°C
Tctl Max 100°C
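The throttling table can be summarized per CPU as the frequency lost between phase 1 and phase 7, and as the share of the 3400 MHz max boost still reached in phase 7. The following is a minimal Python sketch. The frequencies are copied from the FREQ columns of the table as printed (including the phase 7 value for 57/CPU2, which rises instead of falling); the variable and function names are illustrative.

```python
MAX_BOOST_MHZ = 3400  # Max Boost from the CPU data on the slide

# FREQ values (MHz) for throttling phases 1 to 7, as printed in the table.
freq_mhz = {
    "55/CPU1": [2556, 2546, 2550, 2481, 2395, 2268, 2161],
    "55/CPU2": [2792, 2744, 2717, 2646, 2576, 2541, 2447],
    "57/CPU1": [2529, 2436, 2373, 2277, 2206, 2083, 1872],
    "57/CPU2": [2603, 2500, 2414, 2324, 2255, 2121, 2625],
}


def frequency_drop_percent(series):
    """Percentage of frequency lost between the first and the last phase."""
    first, last = series[0], series[-1]
    return 100.0 * (first - last) / first


drops = {cpu: frequency_drop_percent(series) for cpu, series in freq_mhz.items()}
boost_share = {cpu: 100.0 * series[-1] / MAX_BOOST_MHZ for cpu, series in freq_mhz.items()}

for cpu in freq_mhz:
    print(f"{cpu}: drop {drops[cpu]:.1f}%, phase 7 at {boost_share[cpu]:.1f}% of max boost")
```

A negative drop means the last phase ran faster than the first, which is the case for 57/CPU2 with the values as printed.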
[Figure: Platform power vs thermal tolerance estimation (High Density Enterprise); axis ranges 1700 W to 2700 W and 0°C to 60°C]
CPU variations Boost vs base clock
[Figure: CPU temperature fluctuations (min/max) over a 24-hour test cycle using Stresslinux; series: Full boost, Base clock (-20 W)]
Heatsink optimization: fin pitch only
Other OCP platform optimizations
Tioga Pass compute platform
Compute platform
Olympus platform
Conclusion
Improved system capabilities are unlocked by platform optimization
Optimization of platforms benefits from collaborative approach
Differentiation between workloads allows optimization for desired goals
• Thermal
• HPC/High Density
• High availability
Thorough material compatibility studies improve predictability
Tests at scale allow for identification of further optimization steps
Call to Action
• Join and contribute immersion-related content in the ACS Immersion, Server and ACF groups
• Collaborate on optimization of IT platforms for immersion
• Cross-pollination within OCP builds an effective ecosystem
• Sharing knowledge helps increase the immersion potential
• More information: Rolf.Brink@Asperitas.com
• Design Guidelines for Immersion-Cooled IT Equipment: keep an eye on the immersion wiki and mailing list!
• Open cassette spec: https://www.opencompute.org/documents/20200227-open-cassettes-specification-v1-0-pub-pdf
• ACS Immersion:
• Project Wiki with latest information:
https://www.opencompute.org/wiki/Rack_%26_Power/Advanced_Cooling_Solutions_Immersion_Cooling
• Mailing list: http://lists.opencompute.org/mailman/listinfo/opencompute-acsimmersion
Thank you!

OCP Tech Week Immersion Cooling Optimized IT platforms
