UPDATING THE OCP
COMPUTE VOLTAGE STEP
RESPONSE SPECIFICATION
William Wu/Director Product Management/
Penguin Computing
A g e n d a
3
● About Penguin Computing (& why we care about
OCP standards)
● Voltage Instability
Under Certain
Workloads
● Discussion
A b o u t P e n g u i n C o m p u t i n g
4
● Home to Scyld® Beowulf cluster software &
bare metal HPC on cloud Penguin Computing
On-Demand™
● Over 300 OCP racks delivered to date based on
Tundra™ Extreme Scale design
● Platinum OCP member, Penguin CTO Phil
Pokorny is HPC representative of the OCP
Incubation Committee
● U.S.-based 20 year old, global provider of HPC hardware,
software, and services
V o l t a g e I n s t a b i l i t y U n d e r D y n a m i c
W o r k l o a d s
5
● Power shelf can’t respond quick enough during
the rapid transition of power state.
● 64 nodes all transition from medium to full
power in same microsecond, 12V rail sagged
● 64 nodes all transition from FULL power to IDLE
in same microsecond, 12V rail surged
● High Speed, Low Latency Fabric
● Work load synchronized within microseconds
between nodes
● Initial application was LAMPS, reproducible in
other workloads
C l u s t e r C o n f i g u r a t i o n w i t h V o l t a g e
I n s t a b i l i t y
6
● Relion OCP1930e
● Dual Intel Xeon E5-2695 v4
● 128GB DDR4-2400MHz (8x16GB)
● Intel Omni-Path 100
● Diskless Boot
● Air-cooled
● 64 Nodes per Rack (up to 96 Nodes based on
configuration)
C l u s t e r C o n f i g u r a t i o n w i t h V o l t a g e
I n s t a b i l i t y
7
● Open Bridge Rack (40 OU), Single
Zone
● 26kW, N+1, IEC60309 560P7-3P,
5Wire, 277/480V, 60A
● Vertiv NetSure Rectifier and Power
Shelf
C l u s t e r C o n f i g u r a t i o n w i t h V o l t a g e
I n s t a b i l i t y
8
● At high node density and correlated workloads, can
exceed slew rate spec for power shelf
● Symptom shows up at scale, seen at 576 nodes; not
seen at smaller node counts
● It’s an HPC thing - 50%
increase in nodes
R o o t C a u s e
9
● Intel datasheet say CPU can draw more power than it says. Exceed
TDP for up to 4 milliseconds
● Cluster power specification based on calculated power budget off
steady TDP value -- We knew it can exceed, but by how much?
● We were put into a position to consider over-provisioning rack
power
● Over-Engineering and investigating which specific RACK
configuration will create a problem, would be wrong decision to
make all things considered
● But….it happened!
C a p a c i t o r S h e l f
1
0
● Slew rate iss, not total power
● Prototype w/ capacitor directly to bus bars,
proved it can be effective
● Designed additional safety feature,
charge/discharge circuit
● 1 Farad (1 million uF) shelf
● It works, and passed FCC/UL.
● New Design: Capacitor Shelf “Power Buffering
Solution”
F u t u r e U p g r a d e
11
● Incorporated the capacitor shelf into each
individual node so rather than one large
capacitor, it’s many smaller capacitor per node.
● Discussion with power supply MFR, to discuss
increase slew rate for HPC applications for their
technology
● Recommend update to OCP Open Rack power
specification to increase slew rate spec.
(~2000A /µsec)
12
Discussion
Ocp  updating the ocp compute voltage step response specification

Ocp updating the ocp compute voltage step response specification

  • 2.
    UPDATING THE OCP COMPUTEVOLTAGE STEP RESPONSE SPECIFICATION William Wu/Director Product Management/ Penguin Computing
  • 3.
    A g en d a 3 ● About Penguin Computing (& why we care about OCP standards) ● Voltage Instability Under Certain Workloads ● Discussion
  • 4.
    A b ou t P e n g u i n C o m p u t i n g 4 ● Home to Scyld® Beowulf cluster software & bare metal HPC on cloud Penguin Computing On-Demand™ ● Over 300 OCP racks delivered to date based on Tundra™ Extreme Scale design ● Platinum OCP member, Penguin CTO Phil Pokorny is HPC representative of the OCP Incubation Committee ● U.S.-based 20 year old, global provider of HPC hardware, software, and services
  • 5.
    V o lt a g e I n s t a b i l i t y U n d e r D y n a m i c W o r k l o a d s 5 ● Power shelf can’t respond quick enough during the rapid transition of power state. ● 64 nodes all transition from medium to full power in same microsecond, 12V rail sagged ● 64 nodes all transition from FULL power to IDLE in same microsecond, 12V rail surged ● High Speed, Low Latency Fabric ● Work load synchronized within microseconds between nodes ● Initial application was LAMPS, reproducible in other workloads
  • 6.
    C l us t e r C o n f i g u r a t i o n w i t h V o l t a g e I n s t a b i l i t y 6 ● Relion OCP1930e ● Dual Intel Xeon E5-2695 v4 ● 128GB DDR4-2400MHz (8x16GB) ● Intel Omni-Path 100 ● Diskless Boot ● Air-cooled ● 64 Nodes per Rack (up to 96 Nodes based on configuration)
  • 7.
    C l us t e r C o n f i g u r a t i o n w i t h V o l t a g e I n s t a b i l i t y 7 ● Open Bridge Rack (40 OU), Single Zone ● 26kW, N+1, IEC60309 560P7-3P, 5Wire, 277/480V, 60A ● Vertiv NetSure Rectifier and Power Shelf
  • 8.
    C l us t e r C o n f i g u r a t i o n w i t h V o l t a g e I n s t a b i l i t y 8 ● At high node density and correlated workloads, can exceed slew rate spec for power shelf ● Symptom shows up at scale, seen at 576 nodes; not seen at smaller node counts ● It’s an HPC thing - 50% increase in nodes
  • 9.
    R o ot C a u s e 9 ● Intel datasheet say CPU can draw more power than it says. Exceed TDP for up to 4 milliseconds ● Cluster power specification based on calculated power budget off steady TDP value -- We knew it can exceed, but by how much? ● We were put into a position to consider over-provisioning rack power ● Over-Engineering and investigating which specific RACK configuration will create a problem, would be wrong decision to make all things considered ● But….it happened!
  • 10.
    C a pa c i t o r S h e l f 1 0 ● Slew rate iss, not total power ● Prototype w/ capacitor directly to bus bars, proved it can be effective ● Designed additional safety feature, charge/discharge circuit ● 1 Farad (1 million uF) shelf ● It works, and passed FCC/UL. ● New Design: Capacitor Shelf “Power Buffering Solution”
  • 11.
    F u tu r e U p g r a d e 11 ● Incorporated the capacitor shelf into each individual node so rather than one large capacitor, it’s many smaller capacitor per node. ● Discussion with power supply MFR, to discuss increase slew rate for HPC applications for their technology ● Recommend update to OCP Open Rack power specification to increase slew rate spec. (~2000A /µsec)
  • 12.