DOAG 2020 │ ©2020 VMware, Inc.
ESXi Performance
Principles
DOAG Edition
Valentin Bondzio
Sr. Staff TSE / GSS Premier Services
2020-01-23
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 2
Brief Intro
@VMware since 2009
Global Support Services / Premier Services
Focus on Resource Management, Performance and Windows Internals
Originally from Berlin, living in Ireland since 2007
And most importantly …
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 4
Brief Intro
Not an Oracle expert !
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc.
Agenda
CPU Scheduling and Usage Accounting
The “basics”
“Power Management”
The Good, the Better and the Ugly
ESXi Memory Management
More “basics”
Local resource distribution
What else is running on ESXi
CPU Topology Abstraction
CPU Socket != NUMA node
+I/O stuff
+vMotion
+Backup
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 10
Resource guarantees and weighting (shares) on a per VM or “Resource Pool” level
CPU Scheduler Overview
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 13
Dispatch VMs (its “worlds”) to honor CPU settings (Local)
• For fairness: select VM with the least (consumed CPU time / fair share)
• For priority: run latency-sensitive VM (high) before anyone else
CPU Scheduler Overview
What does the scheduler do?
vCPU
HT / Core
vCPU vCPU
vCPU
vCPU vCPU vCPU
IO
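The two local dispatch rules above can be sketched in a few lines. This is only an illustration of the slide's wording, not ESXi's actual scheduler: the `World` type, its field names, and the `pick_next` helper are hypothetical, and "fair share" is treated as a plain positive weight.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Hypothetical stand-in for a schedulable world (e.g. a vCPU)."""
    name: str
    consumed_ms: float            # CPU time consumed so far
    fair_share: float             # assumed positive weight derived from shares
    latency_sensitive: bool = False


def pick_next(ready):
    """Pick the next world to dispatch from a non-empty ready list.

    Priority rule: latency-sensitive (high) worlds run before anyone else.
    Fairness rule: otherwise, the lowest consumed-time / fair-share ratio wins.
    """
    high = [w for w in ready if w.latency_sensitive]
    candidates = high or ready
    return min(candidates, key=lambda w: w.consumed_ms / w.fair_share)


a = World("vm-a", consumed_ms=100.0, fair_share=1000.0)
b = World("vm-b", consumed_ms=100.0, fair_share=2000.0)
c = World("vm-c", consumed_ms=500.0, fair_share=1000.0, latency_sensitive=True)
```

With only `a` and `b` ready, `b` is picked because it has consumed less relative to its share; once `c` is ready, it goes first regardless of its ratio.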
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 18
LLC
Place the worlds / threads on physical CPUs (Global)
CPU Scheduler Overview
What does the scheduler do?
• To balance load across physical execution contexts (PCPUs)
• To preserve cache state, minimize migration cost
• To avoid contention from hardware (HT, LLC, etc.) and sibling vCPUs (from the same VM)
• To keep VMs or threads that have frequent communications close to each other
[Diagram: eight cores, each with two hyper-threads (HT 0, HT 1), grouped under LLCs, with VMs placed on them]
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 19
CPU Scheduler Overview
How does that look?
10:10:29am up 2 days 48 min, 674 worlds, 1 VMs, 2 vCPUs; CPU load average: 0.02, 0.01, 0.01
PCPU USED(%): 0.3 0.1 0.0 0.3 0.2 0.1 0.0 0.0 0.0 0.2 50 50 4.1 0.1 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0 AVG: 4.4
PCPU UTIL(%): 0.5 0.1 0.1 0.6 0.2 0.2 0.0 0.2 0.0 0.3 100 100 4.2 0.2 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.2 0.1 0.1 AVG: 8.6
CORE UTIL(%): 0.6 0.7 0.4 0.9 0.3 100 4.3 0.2 0.0 0.1 0.4 0.7 AVG: 9.1
ID GID NAME NWLD %USED %RUN %SYS %WAIT %VMWAIT %RDY %IDLE %OVRLP
96337 148153 vmx 1 0.02 0.01 0.02 61.82 - 37.86 0.00 0.00
96339 148153 NetWorld-VM-96338 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
96340 148153 NUMASchedRemapEpochInitial 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
96341 148153 vmast.96338 1 0.03 0.05 0.00 99.63 - 0.00 0.00 0.00
96343 148153 vmx-vthread-6 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
96344 148153 vmx-mks:Debian86 1 0.00 0.00 0.00 61.55 - 38.13 0.00 0.00
96345 148153 vmx-svga:Debian86 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
96346 148153 vmx-vcpu-0:Debian86 1 62.35 99.68 0.00 0.00 0.00 0.00 0.00 0.05
96348 148153 vmx-vcpu-1:Debian86 1 62.36 99.67 0.00 0.00 0.00 0.01 0.00 0.05
96347 148153 PVSCSI-96338:0 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
96350 148153 vmx-vthread-7:Debian86 1 0.00 0.00 0.00 99.68 - 0.00 0.00 0.00
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 22
CPU Usage Accounting
What states are there
Not Running
Running
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 23
CPU Usage Accounting
What states are there
Idle
(descheduled)
Running Ready
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 24
CPU Usage Accounting
In an ideal world
Idle
(descheduled)
Running
Ready
Usage
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 28
CPU Usage Accounting
What is charged against the VM
Idle
(descheduled)
Running
Ready
Usage Overlap HT busy Frequency ..
“stolen time”
sys
VMwait
wait
CSTP
RDY
MLMT
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 31
%LAT_C captures the gap between “ideal” execution (demand) and “current” execution.
• “Ideal”: unlimited dedicated cores running at nominal processor frequency
stolen time aka “%LAT_C”
CPU Usage Accounting
Ideal Current
%LAT_C
Sources of Compute Latency:
• VM resource contention: check %RDY and %CSTP
• Power management (P-State): frequency throttling
• Hardware contention: HTs are in use
Demand
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 38
Does enabling HT “spawn” a less capable “logical core”?
Maybe two slightly less capable “logical” cores?
Intel® Hyper-Threading Technology
Cores and Threads
[Diagram: a “physical” core with an extra “logical” core attached, versus a single “physical” core exposing “logical” core0 and “logical” core1]
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 39
Intel® Hyper-Threading Technology
Individual throughput reduction, aggregated throughput increase at high load
100
100
~125
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 43
Intel® Hyper-Threading Technology on ESXi
Throughput reduction is accounted for in USED
100 100
125
HTEfficiencyShift (default: 2) sets how much more efficient HT is assumed to be than no-HT:
1: 50 %
2: 25 %
3: 12.5 %
4: 6.25 %
5: 3.125 %
Per thread, with both HTs busy: 50 + 12.5 = 62.5 (2 x 62.5 = 125)
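The arithmetic on this slide can be written out as a small sketch. It assumes, as one reading of the table, that the HT gain is 100 % / 2^HTEfficiencyShift and that the core's combined throughput is split evenly between two busy threads. The function name is hypothetical and this is not ESXi code.

```python
def used_per_busy_thread(ht_efficiency_shift: int = 2) -> float:
    """USED (%) charged to each thread when both HTs of a core are busy.

    Assumption: the core delivers 100 % plus an HT gain of 100 / 2**shift,
    and each of the two busy threads is charged half of that total.
    """
    gain_pct = 100.0 / (2 ** ht_efficiency_shift)   # shift 2 -> 25 %
    core_total = 100.0 + gain_pct                   # shift 2 -> 125
    return core_total / 2                           # shift 2 -> 62.5


per_thread_default = used_per_busy_thread()   # default shift of 2
```

Under these assumptions the default shift gives 62.5 per thread, which is close to the 62.35 / 62.36 %USED shown for the two vCPU worlds running at about 100 %RUN in the esxtop output earlier in the deck.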
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 44
CPU Usage Accounting
Usage vs. Utilization
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 45
Umbrella Term
Power Management
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 48
Umbrella Term
Power Management
P-States
Deep C-States
Options aka: Power Regulator, CPU Power Management, EIST
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 49
Power Management refresher …
P-State = voltage / frequency point
C-State = idle state: C0 is running, deeper states (C1-Cn) turn off progressively more of the core
[Diagram: Frequency axis with P0 / TB, P1 / NF and P2 to P13; C-State axis with C0 and C1-Cn]
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 52
C1
C1
C1
C1
C-State Transition
~1µs
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 56
C6
C6
C6
C6
Deep C-State Transition
~30µs
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 57
Dell
Power Management _Profiles_
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 58
ESXi Power Management Policy
Only affects what’s presented from the BIOS
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 62
Who controls what? → allow control / use
Power Management refresher …
CPU
BIOS
ESXi
VM / guest
HLT / C1
deep C-States
P-States
HLT / C1-Cn
P-States
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 69
ESXi Power Management Policy
Only affects what’s presented from the BIOS (DELL terminology)
System Profile → "Performance Per Watt (DAPC)"
"Performance Per Watt (OS)"
"Performance"
"Dense Configuration"
"Custom"
CPU Power Management → "System DBPM (DAPC)"
"OS DBPM"
"Maximum Performance"
C States → "Enabled"
"Disabled"
P-States
P-States
P-States
C-States
C-States
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 73
Most likely “Dynamic”
Very likely “Performance”
Most likely …
Which BIOS policy am I running on?
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 74
Most likely “Dynamic”
Which BIOS policy am I running on?
4:30:58pm up 2 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
Power Usage: 94W, Power Cap: N/A
PSTATE MHZ:
CPU %USED %UTIL %C0 %C1 %C2 %A/MPERF
0 0.3 0.7 1 23 76 50.0
1 0.0 0.0 0 0 100 50.1
2 0.1 0.2 0 6 94 50.0
3 0.0 0.0 0 0 100 50.1
4 5.2 10.4 10 5 85 50.0
5 0.0 0.0 0 5 95 51.0
6 0.0 0.1 0 3 97 50.0
7 0.0 0.0 0 0 100 50.0
8 0.1 0.4 0 16 84 50.0
9 0.0 0.0 0 0 100 50.0
10 0.0 0.0 0 0 100 50.0
(…)
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 76
Most likely “Performance”
Which BIOS policy am I running on?
4:38:51pm up 1 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
Power Usage: 142W, Power Cap: N/A
PSTATE MHZ:
CPU %USED %UTIL %C0 %C1 %A/MPERF
0 0.0 0.1 0 100 108.3
1 0.1 0.1 0 100 108.4
2 0.1 0.1 0 100 108.3
3 0.0 0.1 0 100 108.4
4 0.0 0.0 0 100 108.3
5 18.0 16.7 17 83 108.3
6 0.0 0.1 0 100 108.4
7 0.2 0.2 0 100 108.3
8 0.0 0.0 0 100 108.3
9 0.1 0.2 0 100 108.3
10 0.0 0.1 0 100 108.3
(…)
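One way to read the %A/MPERF column is as the ratio of actual to nominal clock while the CPU is not idle, so it can be turned into an approximate effective frequency. The sketch below assumes a nominal frequency of 2400 MHz, borrowed from the PSTATE MHZ line of the “Custom” screen elsewhere in this deck. That value may not apply to the host shown here, and the function name is hypothetical.

```python
def effective_mhz(nominal_mhz: float, aperf_mperf_pct: float) -> float:
    """Approximate effective clock from the %A/MPERF column.

    Assumption: %A/MPERF is actual / nominal frequency in percent,
    measured only while the CPU is in C0.
    """
    return nominal_mhz * aperf_mperf_pct / 100.0


turbo_estimate = effective_mhz(2400, 108.3)      # value above 100 %: running in turbo
throttled_estimate = effective_mhz(2400, 50.0)   # value near 50 %: lowest P-state
```

Under that assumption, 108.3 corresponds to roughly 2.6 GHz and 50.0 to 1.2 GHz, which matches the lowest entry in the PSTATE MHZ list of the “Custom” screen.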
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 78
Most likely “Custom”
Which BIOS policy am I running on?
5:09:53pm up 6 min, 827 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.01, 0.01, 0.00
Power Usage: 107W, Power Cap: N/A
PSTATE MHZ: 2401 2400 2300 2200 2100 2000 1900 1800 1700 1600 1500 1400 1300 1200
CPU %USED %UTIL %C0 %C1 %C2 %P0 %P1 %P2 %P3 %P4 %P5 %P6 %P7 %P8 %P9 %P10 %P11 %P12 %P13 %A/MPERF
0 0.2 0.4 0 16 83 62 0 0 0 0 0 0 0 0 0 0 0 0 38 75.2
1 0.0 0.0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 100 59.3
2 0.0 0.1 0 5 95 15 0 0 0 0 0 0 0 0 0 0 0 0 85 57.9
3 0.0 0.0 0 1 98 38 0 0 0 0 0 0 0 0 0 0 0 0 62 61.5
4 0.0 0.0 0 4 96 5 0 0 0 0 0 0 0 0 0 0 0 0 95 52.0
5 0.0 0.0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 100 50.3
6 0.1 0.1 0 1 99 7 0 0 0 0 0 0 0 0 0 0 0 0 93 67.7
7 0.1 0.1 0 0 100 99 0 0 0 0 0 0 0 0 0 0 0 0 1 77.7
8 0.0 0.0 0 0 100 10 0 0 0 0 0 0 0 0 0 0 0 0 90 50.8
9 0.0 0.1 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 100 51.6
10 0.0 0.0 0 3 97 8 0 0 0 0 0 0 0 0 0 0 0 0 92 54.0
(…)
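The three esxtop power screens differ in two visible ways: whether the PSTATE MHZ line is populated, and whether C-State columns deeper than %C1 appear. A rough heuristic built only on those two observations could look like the sketch below. The function and parameter names are hypothetical, and the result is a "most likely" guess in the spirit of the slides, not a definitive detection.

```python
def guess_bios_policy(pstate_mhz, cstate_columns):
    """Guess the BIOS power policy from what the esxtop power screen exposes.

    pstate_mhz:     values on the 'PSTATE MHZ:' line (empty if none listed)
    cstate_columns: the %C* column headers that are shown
    """
    if pstate_mhz:
        # P-States are handed to ESXi: BIOS leaves control to the OS.
        return "Custom"
    if any(col not in ("%C0", "%C1") for col in cstate_columns):
        # No P-State control, but deep C-States (C2+) are visible.
        return "Dynamic"
    # Neither P-States nor deep C-States are exposed.
    return "Performance"


dynamic_guess = guess_bios_policy([], ["%C0", "%C1", "%C2"])
performance_guess = guess_bios_policy([], ["%C0", "%C1"])
custom_guess = guess_bios_policy([2401, 2400, 2300], ["%C0", "%C1", "%C2"])
```

The order of the checks matters: a populated P-State list is treated as the strongest signal, since the “Custom” screen also shows %C2.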
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 80
The magic of Turbo Boost
Dynamic, supported overclocking
[Diagram, three builds. Axes: Frequency and C-State depth. With the other cores in C1, the active core (C0) runs at P1 / TB1; with some cores in C6, the turbo bins extend to TB4; with more cores in C6, they extend to TB7.]
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 86
Power Policy “playfield”
BIOS “Dynamic” pre Haswell
BIOS “Maximum / High Performance”
Same* as Custom BIOS + High Performance ESXi policy (with the exception of C1E)
Custom BIOS + Custom or Balanced ESXi policy
Bad
Good
Optimal*
* a few workloads fare better with more deterministic performance
BIOS “Dynamic” on Haswell+
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 91
Power Policy “playfield”
Custom done right!
“Performance”
Custom BIOS
+
ESXi Balanced
“Dynamic”
Custom BIOS
+
ESXi Balanced
“Dynamic”
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 95
“Why doesn’t the frequency I see in Task Manager change?”
• Possibility 1: You are looking at the brand string
• Possibility 2: You are looking in the right place
(but the guest OS has no way of knowing)
• Base frequency should be:
CPUID.(EAX=16h):EAX[15-00]
– But it seems Windows is getting that from SMBIOS
Frequently Asked Questions
Power Management Trivia
# grep cpuid ./WinTest.vmx
cpuid.16.eax = "----------------0100011100011000"
cpuid.coresPerSocket = "6"
cpuid.brandstring = "VMware (R) SuperSecretCPU (R) @ 18.2 GHz"
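The `cpuid.16.eax` mask in the .vmx snippet can be decoded to check it against the brand string. CPUID leaf 16h reports the processor base frequency in MHz in EAX[15:0]. The sketch below assumes that `-` marks bit positions the mask leaves untouched and that the string is written most significant bit first; the helper name is hypothetical.

```python
def base_mhz_from_vmx_mask(mask: str) -> int:
    """Decode EAX[15:0] (base frequency in MHz) from a 32-character .vmx CPUID mask."""
    if len(mask) != 32:
        raise ValueError("expected a 32-character mask")
    low16 = mask[-16:]                      # bits 15..0, MSB first (assumption)
    if set(low16) - {"0", "1"}:
        raise ValueError("low 16 bits are not fully specified")
    return int(low16, 2)


mhz = base_mhz_from_vmx_mask("----------------0100011100011000")
ghz = mhz / 1000
```

The low 16 bits decode to 18200 MHz, i.e. the 18.2 GHz that the brand string in the same file advertises.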
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 98
“I turned off all C-States, why is it still showing C1 in esxtop?”
• You can’t turn off C1, you can disable different levels of deep C-States (C2+)
Frequently Asked Questions
Power Management Trivia
4:38:51pm up 1 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
Power Usage: 142W, Power Cap: N/A
PSTATE MHZ:
CPU %USED %UTIL %C0 %C1 %A/MPERF
0 0.0 0.1 0 100 108.3
1 0.1 0.1 0 100 108.4
2 0.1 0.1 0 100 108.3
3 0.0 0.1 0 100 108.4
4 0.0 0.0 0 100 108.3
5 18.0 16.7 17 83 108.3
6 0.0 0.1 0 100 108.4
7 0.2 0.2 0 100 108.3
8 0.0 0.0 0 100 108.3
(…)
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 103
“I won’t have any issues if I have everything set to High Performance in the BIOS, right?”
• No, besides possibly:
– PSU redundancy issues
– Power capping
– Temperature
– Firmware bugs
• And definitely …
– No ability to control P-/deep C-States
– No maximum Turbo Boost frequencies …
Frequently Asked Questions
Power Management Trivia
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-v3-spec-update.pdf
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 106
“I can clearly see C2 in perfmon on Windows, why are you lying to me?”
• This is either a perfmon bug or a choice to represent an “enlightened” idle feature
– “Intelligent Timer Tick Distribution (ITTD)”
– needs Windows 2012 R2 / vHW 11
– disable via “monitor.disable_guest_idle_msr = true”
• you really shouldn’t have to ever …
Frequently Asked Questions
Power Management Trivia
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 107
What runs where and when
The high level picture
CPU
VMK VMM
OS / APPs
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 108
What runs where and when
Mostly Direct Exec
CPU
OS / APPs
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 109
What runs where and when
Mostly Direct Exec
PCPU
vCPU
(…)
0xffffffff810a99d0 <+416>: test %eax,%eax
0xffffffff810a99d2 <+418>: je 0xffffffff810a9932 <cpu_startup_entry+258>
0xffffffff810a99d8 <+424>: callq 0xffffffff810c6ed0 <rcu_irq_enter>
0xffffffff810a99dd <+429>: mov 0x82740c(%rip),%r13
0xffffffff810a99e4 <+436>: test %r13,%r13
0xffffffff810a99e7 <+439>: je 0xffffffff810a9a07 <cpu_startup_entry+471>
0xffffffff810a99e9 <+441>: mov 0x0(%r13),%rax
0xffffffff810a99ed <+445>: nopl (%rax)
0xffffffff810a99f0 <+448>: mov 0x8(%r13),%rdi
0xffffffff810a99f4 <+452>: add $0x10,%r13
0xffffffff810a99f8 <+456>: xor %esi,%esi
0xffffffff810a99fa <+458>: mov %ebp,%edx
0xffffffff810a99fc <+460>: callq *%rax
0xffffffff810a99fe <+462>: mov 0x0(%r13),%rax
0xffffffff810a9a02 <+466>: test %rax,%rax
0xffffffff810a9a05 <+469>: jne 0xffffffff810a99f0 <cpu_startup_entry+448>
0xffffffff810a9a07 <+471>: callq 0xffffffff810c6e40 <rcu_irq_exit>
0xffffffff810a9a0c <+476>: jmpq 0xffffffff810a9932 <cpu_startup_entry+258>
0xffffffff810a9a11 <+481>: nopl 0x0(%rax)
0xffffffff810a9a18 <+488>: mov %gs:0xa0e4,%eax
0xffffffff810a9a20 <+496>: mov %eax,%eax
0xffffffff810a9a22 <+498>: bt %rax,(%rbx)
(…)
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 110
What runs where and when
What about Idle?
CPU
vCPU
(…)
0xffffffff81052c20 <+0>: sti
0xffffffff81052c21 <+1>: hlt
*loud screeching sound*
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 111
What runs where and when
The VMM traps the privileged instruction and, together with the VMK, puts the vCPU to “sleep”
CPU
VMM
(…)
0xffffffff81052c20 <+0>: sti
0xffffffff81052c21 <+1>: hlt
*tells VMK to deschedule*
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 112
What runs where and when
The scheduler decides what to run next
CPU
VMK
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 113
What runs where and when
E.g. a vCPU / world that is ready to run
CPU
other vCPU
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 114
What runs where and when
ESXi’s _own_ idle thread
CPU
C1-Cn
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 115
Manage host physical memory to abstract physical memory away from guest.
Allow memory over-commitment to provide an illusion of virtual DRAM to the guest.
Hide transient host memory pressure from application
Memory Management Overview
Goals and Objectives
Host Physical Memory Guest Memory
ESXi
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 121
Virtual Memory
From the process’ point of
view, it provides:
• Contiguous address space
• Isolation / Security
Virtual Memory abstracts
• It provides the possibility to
overcommit …
The process is unaware what
is backing the virtual address
• Physical Memory
• Swap File
Process 0
Process 1
Process 2
Process 3
Process n
Magic
256 TB
256 TB
256 TB
256 TB
256 TB
64 TB
256 TB
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 129
Virtual Physical Memory
From the VMs point of view,
it provides:
• Contiguous address space
• Isolation / Security
Virt. Physical Mem. abstracts
• It provides the possibility to
overcommit …
The VM is unaware what is
backing the physical address
• Physical Memory
• Swap File
• Or COW, ZIP, BLN
VM 0
VM 1
VM 2
VM 3
VM n
Magic
6 TB
6 TB
6 TB
6 TB
6 TB
16 TB
*** TB
*** TB
Abstraction …
*** TB
*** TB
*
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 134
Understanding VM memory usage on ESXi
Reclaim memory from a VM if it is using more than it is entitled to.
• Entitlement depends on configuration (reservation / shares / limit).
• Techniques to reclaim memory from VMs include:
– Page sharing > Ballooning > Compression > Host swapping
– Reclamation breaks host large pages
Memory Management Overview
How to Hide Memory Pressure?
Total Memory Size
Allocated Memory
Free Memory
Active Memory
Idle Memory
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 138
Active Memory
Not the same as guest stats!
!=
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 143
Active Memory
aka Touched
ESXi VM level heuristic
• Weighted, moving average
• OS / VMTools independent
• “Memory Sampling”
– Un-maps 100 random pages over the entire VM’s mapped address space
– Monitors R/W for a minute (access traps to the VMM)
– After one minute, re-maps all remaining pages, starts again
Diagram: VM mapped memory, 100 x 4 KB pages sampled / min
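The sampling idea above can be sketched as a toy model. This is a minimal sketch under stated assumptions, not ESXi's implementation: the function names, the smoothing weight `alpha` and the simulated set of touched pages are hypothetical. Only the 100-page sample and the weighted moving average come from the slide.

```python
import random

SAMPLE_SIZE = 100  # pages sampled per period (from the slide)


def estimate_active_fraction(touched_pages, total_pages, rng):
    """Sample random pages and return the fraction that was touched.

    touched_pages is a set of page numbers the guest accessed during the
    period; it stands in for the access traps the VMM would observe.
    """
    sample = rng.sample(range(total_pages), min(SAMPLE_SIZE, total_pages))
    hits = sum(1 for page in sample if page in touched_pages)
    return hits / len(sample)


def update_moving_average(previous, current, alpha=0.5):
    """Weighted moving average; alpha is a made-up smoothing weight."""
    return alpha * current + (1 - alpha) * previous


# Hypothetical VM: 1 GB of 4 KB pages, a quarter of them touched.
rng = random.Random(0)
total_pages = 262_144
touched = set(range(total_pages // 4))

estimate = 0.0
for _ in range(5):  # five one-minute sampling periods
    sampled = estimate_active_fraction(touched, total_pages, rng)
    estimate = update_moving_average(estimate, sampled)
print(f"estimated active fraction: {estimate:.2f}")
```

With only 100 samples per period the estimate is statistical, which is one reason it will rarely match what the guest reports.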
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 144
Active Memory
vs. Consumed
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 145
Active Memory
What to trust?
consumed
active
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 151
Active Memory
What to trust?
active consumed
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 153
Guest Memory Metrics
In a nutshell
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 158
Active Memory
The guest’s working set tends to be between active and consumed
consumed
active guest WS
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 159
Active Memory
Guest WS might over-report (greedy app)
active guest WS
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 160
Active Memory
But guest WS will not underreport
consumed
active
guest WS
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 161
Active Memory
Not the be-all and end-all of guest workload estimation
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 166
Hierarchical Resource Groups
From an ESXi perspective
host
system vim iofilters user
The host owns all resources
Those are distributed by
hierarchical resource groups
Consumers can demand
(request) resources
minfree kernel helper ft drivers vmotion …
vmkboot CpuSched Init …
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 169
Hierarchical Resource Groups
From an ESXi perspective
host
system vim iofilters user
vCenter shows the sum of all
user resources as:
Total Reservation Capacity
Global Resource Pools are
then distributed back to
hosts into Local RPs
• Based on VMs demand
…
vm.vmid
vm.vmid
vm.vmid
…
pool4
pool3
pool2
pool1
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 172
Hierarchical Resource Groups
From an ESXi perspective
user Local Resource Groups are
created and incrementally
numbered when clients are
instantiated:
• VM starts / vMotions etc.
• Based on VMs demand
The local hierarchy is equal
to the global one
• Check for VM / LRG siblings
VM groups have multiple leaf
consumers
• vmid is local, not global
…
vm.vmid
vm.vmid
…
pool430
pool231
pool15
pool1
vm.vmid pool321
vm.vmid vm.vmid …
vmm uw ...
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 173
CPU
cpu.resv Reservation
cpu.limit Limit
cpu.shares Shares
cpu.resvLimit Expandable*
Memory
mem.resv Reservation
mem.limit Limit
mem.shares Shares
mem.resvLimit Expandable*
Hierarchical Resource Groups
Both Memory and CPU resources
host
system vim iofilters user
…
vm.vmid
vm.vmid
vm.vmid
…
pool4
pool3
pool2
pool1
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 174
Tools
ESXi CLI (via SSH)
sched-stats … for CPU
memstats … for Memory
esxtop … for comparison
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 180
Tools
cmdline for local groups (no VMs)
# sched-stats -t groups | awk 'NR == 1
|| $2 ~ /^(vm.|pool)[0-9]+/
|| /^ +[0-4] /
{printf ("%-10s%-12s%-9s%-6s%-6s%-6s%-9s%-6s%-9s%-9s%-10s\n"
,$1, $2, $3, $6, $8, $9, $10, $11, $12, $13, $14)}'
vmgid     name        pgid     vsmps amin  amax  minLimit units ashares  resvMHz  availMHz
0         host        0        933   1600  1600  1600     pct   4096000  5232     33168
1         system      0        659   10    -1    -1       pct   500      288      33168
2         vim         0        271   4944  -1    -1       mhz   500      4344     33768
3         iofilters   0        3     0     -1    -1       pct   1000     0        33168
4         user        0        0     0     -1    -1       pct   9000     0        33168
sched-stats
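To make the ashares column concrete, here is a small worked example. It is a sketch under a simplifying assumption that is not stated on the slide: sibling groups under full contention, with no reservation or limit taking effect, split capacity in proportion to their shares. The share values are the ones from the output above; the function name is hypothetical.

```python
def share_fractions(shares):
    """Return each sibling's fraction of the contended resource."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# ashares of the four children of "host", taken from the sched-stats output
ashares = {"system": 500, "vim": 500, "iofilters": 1000, "user": 9000}
fractions = share_fractions(ashares)

for name, fraction in fractions.items():
    print(f"{name:<10} {fraction:6.1%}")
```

Under that assumption the user group (where the VMs live) would be entitled to roughly nine elevenths of contended CPU time.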
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 184
Tools
cmdline for local groups (with VMs)
# memstats -r group-stats
-g0 -l2
-s gid:name:min:max::conResv:availResv
-u mb
| sed -n '/^-\+/,/.*\n/p'
---------------------------------------------------------------------------------
gid  name       min    max    conResv  availResv
---------------------------------------------------------------------------------
0    host       97823  97823  28917    68907
1    system     20024  -1     20008    68923
2    vim        0      -1     3378     68907
3    iofilters  0      -1     25       68907
4    user       0      -1     5490     68907
---------------------------------------------------------------------------------
memstats
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 195
DIMMs
Socket / Package
NUMA node
Socket != NUMA node
LLC / DIE
(CoD, SNC / Zen1/2)
(N)UMA
+ terminology
0
2
1
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 205
Importance of Memory Access Latency
Jim Gray’s Storage Latency Analogy (slightly adapted)
You want to calculate a + b and the operands are in:
• your head = register / 1 cycle
• this room = L1-L2 / 10 cycles
• this building = DRAM / 100 cycles
• Finland + Algeria = Disk / 10^6 cycles
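A quick way to feel these orders of magnitude is to stretch one CPU cycle to one second. That scale is an assumption made here for illustration and is not part of the analogy on the slide; the cycle counts are the ones listed above.

```python
latencies_in_cycles = {
    "register": 1,
    "L1-L2": 10,
    "DRAM": 100,
    "disk": 10**6,
}


def human_time(cycles):
    """Express a latency as wall-clock time if 1 cycle took 1 second."""
    seconds = cycles
    if seconds < 60:
        return f"{seconds} s"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    if seconds < 86_400:
        return f"{seconds / 3600:.1f} h"
    return f"{seconds / 86_400:.1f} days"


for name, cycles in latencies_in_cycles.items():
    print(f"{name:<9} {human_time(cycles)}")
```

On that scale a register access takes a second, while a disk access takes more than eleven days.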
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 214
Importance of Memory Access Latency
Numbers based on Intel i7-3770 @ 3.4 GHz
access  size    cycles  ns
L1      32 KB   4-5     1.5
L2      256 KB  12      4
L3      8 MB    30      10
DRAM    GBs     30+     66*
Diagram: core 0 to core 3, each with its own L1 and L2; shared L3 / Last Level Cache; IMC and QPI; DRAM
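The cycles and ns columns are tied together by the clock rate. The following sketch converts one into the other for the 3.4 GHz part named in the slide title; the function name is hypothetical, and the slide's ns values appear to be rounded up slightly compared with the raw division.

```python
CLOCK_GHZ = 3.4  # Intel i7-3770 base clock, from the slide title


def cycles_to_ns(cycles, clock_ghz=CLOCK_GHZ):
    """At f GHz one cycle lasts 1/f nanoseconds."""
    return cycles / clock_ghz


for level, cycles in [("L1", 5), ("L2", 12), ("L3", 30)]:
    print(f"{level}: {cycles} cycles = {cycles_to_ns(cycles):.1f} ns")
```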
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 215
N(UMA)
All sockets share the FSB to
the Northbridge and hence
the bandwidth
• NB also known as “Memory
Controller Hub” or MCH
Uniform memory access
latency between every CPU
and every DIMM
Von Neumann Bottleneck
getting worse with faster
CPUs / more RAM
Pre-Opteron/Nehalem
1 2
NB
0 3
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 218
0 1
3 2
NUMA
Every NUMA node has its own Integrated Memory Controller (IMC)
• Some AMD CPUs (Bulldozer and newer) have two nodes per socket / package
Remote access has to go over the interconnect and the remote CPU’s IMC
• This adds latency, making local and remote access Non-Uniform
Post-Opteron/Nehalem
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 222
NUMA
2 QPI / IC
Node layout (diagram): 0 1 / 3 2
Latency in ns by CPU node:
CPU   0    1    2    3
0     72   291  323  294
1     296  72   293  315
2     319  296  71   296
3     290  325  300  71
Legend: local, adjacent, “routed”
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 223
NUMA
3 QPI / IC
Node layout (diagram): 0 1 / 3 2
Latency in ns by CPU node:
CPU   0    1    2    3
0     136  194  198  201
1     194  135  194  196
2     201  194  135  200
3     202  197  198  135
Legend: local, adjacent
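The two latency matrices can be summarised as a local versus remote penalty. This sketch only averages the numbers from the two slides; the variable and function names are hypothetical.

```python
def local_and_remote_mean(matrix):
    """Return (mean local latency, mean remote latency) in ns."""
    size = len(matrix)
    local = [matrix[i][i] for i in range(size)]
    remote = [matrix[i][j] for i in range(size) for j in range(size) if i != j]
    return sum(local) / len(local), sum(remote) / len(remote)


two_links = [
    [72, 291, 323, 294],
    [296, 72, 293, 315],
    [319, 296, 71, 296],
    [290, 325, 300, 71],
]
three_links = [
    [136, 194, 198, 201],
    [194, 135, 194, 196],
    [201, 194, 135, 200],
    [202, 197, 198, 135],
]

for label, matrix in [("2 QPI", two_links), ("3 QPI", three_links)]:
    local, remote = local_and_remote_mean(matrix)
    print(f"{label}: local {local:.1f} ns, remote {remote:.1f} ns, "
          f"ratio {remote / local:.2f}x")
```

On the first host a remote access costs more than four times a local one, on the second less than one and a half times, so the payoff of keeping memory local depends heavily on the platform.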
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 224
NUMA
Basic Migration Types
Node layout (diagram): 0 1 / 3 2
• NUMA clients (vCPUs + memory) are kept local to a home node
• Balance migrations re-assign the home node, memory follows vCPUs!
• Locality migrations set the home node to where the most memory resides
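The locality migration rule can be pictured with a toy model. This is a sketch under stated assumptions, not the ESXi NUMA scheduler: the function names and the page counts are hypothetical, and the real scheduler weighs further factors such as CPU contention before moving a client.

```python
def pick_home_node_by_locality(pages_per_node):
    """Choose the node that currently holds the most of the client's memory."""
    return max(pages_per_node, key=pages_per_node.get)


def local_memory_percent(pages_per_node, home_node):
    """Share of the client's memory that is local to the given home node."""
    total = sum(pages_per_node.values())
    if total == 0:
        return 0.0
    return 100.0 * pages_per_node[home_node] / total


# Hypothetical client whose pages are spread over four nodes.
pages = {0: 100, 1: 700, 2: 150, 3: 50}

for home in sorted(pages):
    print(f"home node {home}: {local_memory_percent(pages, home):.0f}% local")
print("locality migration would pick node", pick_home_node_by_locality(pages))
```

A balance migration works the other way round: the home node changes first and the memory has to follow, which is why locality drops right after such a move.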
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 232
NUMA migration incurs significant cost.
• All pages need to be remapped, i.e. %localMemory initially drops to 0% and slowly recovers.
• Copying memory pages across NUMA boundaries costs memory bandwidth.
NUMA Scheduler Consideration
Local Contention vs Remote Access
[Chart: Memory Locality & NUMA-migrations (with NUMA Migration): %local and #migrations over time (30 sec units)]
[Chart: Memory Locality & NUMA-migrations (No NUMA Migration): %local and #migrations over time (30 sec units)]
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 241
numa.vcpu.min = 9
Max vSMP 32
Max vSMP 8
CPS in GUI & supported
We had good(ish) reasons
vNUMA auto-sizing history
(…) 2007 2008 2009 2010 2011 2012 2013 2014 (…)
cpuid.coresPerSocket vNUMA
My starting date @ VMware
ESX 4.0 ESX 4.1 ESXi 5.0 ESXi 5.1 ESXi 5.5 ESXi 6.0
ESX 3.5
cpuid.coresPerSocket → numa.vcpu.maxPerVirtualNode
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 245
Two levels of abstraction
Virtual and Physical Proximity Domains
vNUMA Topology (VPD / PPD)
• VPD doesn’t affect ESXi sched.
• PPD does define ESXi NUMA sched. (AKA NUMA client)
CPU Topology (CPS)
• Doesn’t influence ESXi sched.
• Might influence Guest / App sched.
DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 249
Running Compute Intensive Benchmark
Case Study: Project Pacific
43.5% local memory access
on native Linux
99.2% local memory
access on Pacific Cluster
https://blogs.vmware.com/performance/2019/10/how-does-project-pacific-deliver-8-better-
performance-than-bare-metal.html
250
DOAG 2020 │ ©2020 VMware, Inc.
IO stuff
Herculean Network IO
vSphere 6.0 achieves Line Rate throughput on a 40GigE NIC
• Throughput ↑ from 20.5 to 35.5 Gbps
• CPU Used ↓ from 36 to 13 % (per Gbps)
Network – Virtual NIC coalescing recap
Trading CPU Cycles for Lower Latency
By default, vSphere tunes for lower CPU usage by batching I/O operations
• By default, that is also the case for the RX and TX path on vNICs (here vmxnet3)
• When disabled:
– Every packet received interrupts immediately
– Every packet will be issued immediately
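The trade-off behind vNIC coalescing can be sketched with a toy model. This is only an illustration, not how vmxnet3 actually decides when to interrupt: the fixed `batch_size`, the `simulate` function and the `CoalescingResult` type are assumptions made up for this example. It counts interrupts (a stand-in for CPU cost) and the average time a packet waits for its interrupt (added latency).

```python
from dataclasses import dataclass


@dataclass
class CoalescingResult:
    interrupts: int          # number of interrupts delivered to the guest
    mean_added_delay: float  # average time a packet waited for its interrupt


def simulate(arrivals, batch_size):
    """Deliver one interrupt per `batch_size` packets (hypothetical toy model).

    `arrivals` is a sorted list of packet arrival times in any time unit.
    batch_size=1 models coalescing disabled: every packet interrupts
    immediately. A trailing partial batch fires at its last arrival.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    interrupts = 0
    total_delay = 0.0
    for start in range(0, len(arrivals), batch_size):
        batch = arrivals[start:start + batch_size]
        fire_time = batch[-1]  # the interrupt fires once the batch is complete
        interrupts += 1
        total_delay += sum(fire_time - t for t in batch)
    mean = total_delay / len(arrivals) if arrivals else 0.0
    return CoalescingResult(interrupts, mean)


arrivals = [float(i) for i in range(12)]  # 12 packets, one per time unit
disabled = simulate(arrivals, batch_size=1)
batched = simulate(arrivals, batch_size=4)
print("coalescing disabled:", disabled)
print("batches of 4:       ", batched)
```

With coalescing disabled the model pays one interrupt per packet and adds no delay; with batches of four it pays a quarter of the interrupts but every packet waits on average 1.5 time units.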
Network – Possible Latency Optimizations
Network latency optimization on the VM level
Disable LRO (Large Receive Offload)
• Host wide: “Net.Vmxnet3SwLRO = false”
• Small packets are no longer concatenated into larger ones
Disable (vNIC) coalescing
• VMX option: “ethernetX.coalescingScheme = disabled”
• Issue TX immediately and immediately interrupt on RX
Disable Dynamic queueing
• NetQueue feature, load balances and combines less used queues
• Disabling guarantees a single queue for the VM
Network – Recommendations
Use vmxnet3 Guest Network Driver
• Very efficient and required for maximum performance
Evaluate Disabling Interrupt Coalescing
• Default mechanism may induce small amounts of latency in favor of throughput
It’s a 10Gb+ World
• 1Gb saturation is real, more bandwidth required today, especially in light of vSAN, MonsterVM vMotion
Use Latency Sensitivity High ‘Cautiously’
• While it can reduce latency and jitter in the 10 µs use case, it comes at a cost with core reservations, etc.
• Requires FULL CPU and MEM reservation, or it won’t work and won’t tell you
Herculean Storage IO
• More than 1 Million IOPS from 1 VM
Hypervisor: vSphere 5.1
Server: HP DL380 Gen8
CPU: 2 x Intel Xeon E5-2690, HT disabled
Memory: 256GB
HBAs: 5 x QLE2562
Storage: 2 x Violin Memory 6616 Flash Arrays
VM: Windows Server 2008 R2, 8 vCPUs and 48GB.
Iometer Config: 4K IO size w/ 16 workers
Reference: http://blogs.vmware.com/performance/2012/08/1millioniops-on-1vm.html
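To put the headline number in perspective, the arithmetic below converts it into bandwidth and per-HBA load. It is a back-of-the-envelope sketch: it assumes exactly 1,000,000 IOPS (the slide says "more than"), that "4K" means 4 KiB, and that load spreads evenly over the five HBAs, none of which the slide states.

```python
IOPS = 1_000_000          # lower bound taken from the slide
IO_SIZE_BYTES = 4 * 1024  # Iometer 4K I/O size, assumed to be 4 KiB
HBA_COUNT = 5             # 5 x QLE2562 from the slide


def throughput_bytes_per_second(iops: int, io_size_bytes: int) -> int:
    """Bandwidth needed to sustain `iops` I/Os of `io_size_bytes` each."""
    return iops * io_size_bytes


bytes_per_second = throughput_bytes_per_second(IOPS, IO_SIZE_BYTES)
gib_per_second = bytes_per_second / 1024**3
iops_per_hba = IOPS / HBA_COUNT  # assumes an even spread across HBAs

print(f"{bytes_per_second / 1e9:.3f} GB/s ({gib_per_second:.2f} GiB/s)")
print(f"{iops_per_hba:,.0f} IOPS per HBA")
```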
Bare-metal to virtual TPC-C* gap then and now(ish)
Chart annotations: -30% and -10%
* Non-compliant, fair-use implementation of the workload on Oracle 12c. Not comparable to official results.
Scaling out vs. up on the same host to amortize overhead
TPC-E on native HP Proliant DL 385 G8 (Throughput Score)
• Baremetal tpsE: 1416.37
TPC-VMS on virtualized HP Proliant DL 385 G8 (Throughput Score)
• Virtual tpsE of 3 VMs running TPC-VMS: 470.31, 468.11, 457.55
http://blogs.vmware.com/vsphere/2013/09/worlds-first-tpc-vms-benchmark-result.html
http://www.tpc.org/4064 / http://www.tpc.org/5201
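The comparison the two charts invite can be written out as simple arithmetic. This is only a sketch of the slide's visual argument: summing the three per-VM scores and dividing by the bare-metal score is an assumption of this example, and TPC-E and TPC-VMS results are not officially comparable.

```python
BARE_METAL_TPSE = 1416.37         # native TPC-E score from the slide
VM_TPSE = [470.31, 468.11, 457.55]  # per-VM scores from the slide

aggregate = sum(VM_TPSE)
ratio = aggregate / BARE_METAL_TPSE

print(f"aggregate of 3 VMs: {aggregate:.2f} tpsE")
print(f"relative to bare metal: {ratio * 100:.2f}%")
print(f"gap: {(1 - ratio) * 100:.2f}%")
```

Under these assumptions the three VMs together land within about 1.5% of the single bare-metal instance.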
The Problem - with Database Logs
Storage I/O latencies are higher when virtualized
Usually not a noticeable problem for Data IO
• Long (5+ ms) latency on HDDs
• Random I/O, Many threads banging on the same spindle(s)
• Even some SSDs are ~1ms
Not OK for Redo Log access
• Short (<<1ms latency)
• Sequential I/O, Single-threaded, Write-Only
• Typically a write-back cache in the HBA or the array
• Check the Top 5 wait events in Oracle AWR or equivalent database health reports
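Why latency matters so much more for the redo log than for data files follows from the single-threaded, sequential access pattern: if each write has to complete before the next one is issued, latency directly caps the write rate. A minimal sketch of that bound, assuming strictly serial I/O with no batching of several commits per write (`max_serial_iops` is a name made up for this example):

```python
def max_serial_iops(latency_ms: float) -> float:
    """Upper bound on I/Os per second when every I/O waits for the previous one."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Latencies loosely matching the slide: HDD (5 ms), some SSDs (1 ms),
# and a sub-millisecond write-back cache (0.25 ms is an illustrative value).
for latency_ms in (5.0, 1.0, 0.25):
    print(f"{latency_ms:>5} ms per write -> at most {max_serial_iops(latency_ms):,.0f} writes/s")
```

Many parallel threads can hide a 5 ms latency for data I/O; a single serial writer cannot, which is why a fraction of a millisecond of added latency is visible on the log device.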
The Solution - Trade CPU Cycles for Lower Latency
By default, vSphere tunes for lower CPU usage by batching I/O operations
But when sensing low IOPS, vSphere stops batching and switches to low latency mode
• For lowest latency, put the log device on a vSCSI adapter by itself
• Batching and coalescing are on a per-vSCSI bus, not device(!) basis
• Explicit tuning can prove more effective though
Explicit workaround on the issuing path:
• Default is Asynchronous request passing from vSCSI adapter to VMKernel
– But dynamically adjusts for low IOPS case
• To explicitly force immediate initiation of I/O operation (sync)
– scsiNNN.reqCallThreshold = “1”
Explicit workaround on the completion path:
• Default is coalescing of Virtual Interrupts
– vSphere automatically suspends interrupt coalescing for low IOPS workloads
• Or explicitly disable Virtual Interrupt Coalescing
– For PVSCSI: scsiNNN.intrCoalescing = “False”
– For other vHBAs: scsiNNN.ic = “False”
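The per-adapter options above can be generated rather than typed by hand, which helps when the log device sits on its own vSCSI adapter. A small sketch, assuming that "NNN" stands for the adapter number (scsi0, scsi1, ...); the helper name `low_latency_vmx_options` is hypothetical and it only renders text, it does not edit a VMX file.

```python
def low_latency_vmx_options(adapter: int, pvscsi: bool) -> list:
    """Render the slide's VMX lines for vSCSI adapter `scsi<adapter>`.

    Assumption: NNN in the slide is the adapter number. The option names
    and values are copied from the slide, nothing else is added.
    """
    if adapter < 0:
        raise ValueError("adapter must be >= 0")
    prefix = f"scsi{adapter}"
    # PVSCSI and other vHBAs use different keys for interrupt coalescing.
    coalescing_key = "intrCoalescing" if pvscsi else "ic"
    return [
        f'{prefix}.reqCallThreshold = "1"',       # issuing path: initiate I/O immediately
        f'{prefix}.{coalescing_key} = "False"',   # completion path: no interrupt coalescing
    ]


# Example: the redo log disk lives alone on the second adapter, a PVSCSI one.
for line in low_latency_vmx_options(adapter=1, pvscsi=True):
    print(line)
```

Keeping the log device on a dedicated adapter means these settings cost extra CPU only for the I/O that benefits from them.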
Myth Revisited: RDM versus VMFS
VMFS on par or faster than RDM (approx. 1%)
Reference: http://www.vmware.com/techpapers/2017/sql-server-vsphere65-perf.html
Storage – Recommendations
Use Multiple vSCSI Adapters
• Allows for more queues and I/Os in flight
Use pvscsi vSCSI Adapter
• More efficient I/Os per cycle
Don’t Use RDMs
• Unless needed for shared disk clustering, no longer a performance advantage
VMware Snapshots Should Be ‘Temporary’
• Despite constant performance improvements, snapshots should not live forever (Co-Stop, Synchronous)
Leverage Your Storage OEM’s Integration Guide
• They provide necessary guidance around items like multi-pathing
vMotion
vMotion Workflow
Diagram: Source ESXi Host, Destination ESXi Host, vMotion Network, Datastore
1. Create VM on Destination
2. Copy Memory
3. Quiesce VM on Source
4. Transfer Device State
5. Resume VM on Destination
6. Power Off VM on Source
Execution Switchover Time of 1 sec
Iterative Memory Pre-Copy
Memory Copy: Source VM Memory → Destination VM Memory
Phase 0:
Copy the VM’s 40GB of memory, trace pages. As we send that memory, the VM dirties 10GB
Phase 1:
Retransmit the dirtied 10GB. In the process, the VM dirties another 3GB
Phase 2:
Send the 3GB. While that transfer is happening, the VM dirties 1GB
Phase 3:
Send the remaining 1GB
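The phases above converge because each pass has less to send than the one before. A minimal sketch of that loop, assuming the guest dirties a constant fraction of whatever is transferred in a pass; the slide's own numbers (40, 10, 3, 1 GB) do not follow a constant ratio, and `precopy_phases`, `dirty_ratio` and `switchover_threshold_gb` are names invented for this example, not vMotion parameters.

```python
def precopy_phases(memory_gb, dirty_ratio, switchover_threshold_gb, max_phases=20):
    """Return the amount of memory (GB) sent in each pre-copy phase.

    dirty_ratio: GB dirtied by the guest per GB transferred; must be
    below 1, otherwise the copy never converges.
    The last entry is the remainder that is small enough to send last.
    """
    if not 0 <= dirty_ratio < 1:
        raise ValueError("dirty_ratio must be in [0, 1) for pre-copy to converge")
    phases = []
    to_send = float(memory_gb)
    for _ in range(max_phases):
        phases.append(to_send)
        if to_send <= switchover_threshold_gb:
            break  # remainder is small enough, stop iterating
        to_send *= dirty_ratio  # pages dirtied while this phase was in flight
    return phases


phases = precopy_phases(memory_gb=40, dirty_ratio=0.25, switchover_threshold_gb=1.0)
print(phases)
print("total transferred (GB):", sum(phases))
```

The model also shows why a VM that dirties memory faster than the network can ship it is a problem: as `dirty_ratio` approaches 1 the number of phases, and the total data sent, grows without bound.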
vMotion of Oracle RAC
It’s been working for a while …
Common Issues for Monster VMs
pre 6.5*
• Trace Cost
• LP remap
• Preallocated memory
• RDTSC cost
• (SDPS)
- use ESXi 6.5
- use multi NIC (10Gb+!)
Performance
Troubleshooting
How to troubleshoot any issue
No matter how complicated
1. Identify a related system or component that your team is not responsible for
2. Hypothesize that the issue is with that component
3. Assign the issue to the responsible team
4. When proven wrong, go to 1.
Perfectly valid methods to “troubleshoot” or “tune” /s
• Tuning guide for a completely different system
• Some advanced option found on a blog
• Vaguely fitting KB
• etc.
The biggest enemy
"XY Problem"
1. I have problem X
2. I think it is because of Y
3. I have problem Y
4. Help me solve problem Y
5. Hey! I still have a problem
tl;dr
don’t jump to conclusions
Where to use caution
Believing anybody
“Trust, but verify.”*
* From the Russian proverb: "Доверяй, но проверяй" {Doveryai, no proveryai}
Where to use caution
Comparing hosts, past and present, etc.
• Don’t assume newer == better
• Identify all differences
Where to use caution
Relying on Traffic Light Dashboards alone
All metrics green?
→ All good then! (false negative)
Some metrics red?
→ Something must be broken! (false positive)
Where to use caution
Working through a list of known issues
Very good to start with!
• Don’t spend more than half an hour
Can be from different perspectives
• Application
• Resources, e.g.:
– CPU contention
– Memory pressure
– Disk latency
– Etc.
Apply different methodologies as needed
e.g. directionally
Top → Down: drill down from the application / its metrics
• app specific / difficult to "profile" the whole path
Bottom → Up: investigate from the resource point of view
• easy to run into false positives / not all resources evenly covered
Recommendation: Bottom Up Checklist first
Ask questions
Good ones, preferably
• What makes you think there is a performance issue?
• Has it ever performed well?
• What has changed since?
• Can it be quantified?
• What else is affected?
• What is the timing?
• Is it reproducible?
• etc.
Take notes along the way
seriously
"Remember kids, the only difference between science and screwing around is writing it down."
Provide an exact timeline
Part of notetaking but often forgotten
• 2017-11-28 23:00 UTC: Upgrade
• 2017-11-29 07:00 UTC: Issue first noticed
• 2017-11-29 > 23:59 UTC: Tried everything under the sun and wrote down nothing
• 2017-11-30 08:00: Called GSS
Be accurate and universal
https://xkcd.com/1179/
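The linked comic is about ISO 8601 dates (YYYY-MM-DD). One way to keep notes "accurate and universal" is to stamp every entry in UTC in that format. A minimal sketch; the `note` helper is hypothetical, and the one-hour offset in the second example is an invented local time zone used only to show the conversion.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def note(event: str, when: Optional[datetime] = None) -> str:
    """Format a troubleshooting note with an ISO 8601 timestamp in UTC."""
    when = when or datetime.now(timezone.utc)
    if when.tzinfo is None:
        # A timestamp without a zone is ambiguous, refuse to guess.
        raise ValueError("naive datetime: attach a time zone first")
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {event}"


upgrade = note("Upgrade", datetime(2017, 11, 28, 23, 0, tzinfo=timezone.utc))
# Hypothetical: the call was logged at 09:00 in a UTC+1 local time zone.
called = note("Called GSS", datetime(2017, 11, 30, 9, 0, tzinfo=timezone(timedelta(hours=1))))
print(upgrade)
print(called)
print(note("Started collecting esxtop data"))  # stamped with the current UTC time
```

Rejecting naive datetimes is deliberate: an entry like "08:00" without a zone is exactly the kind of ambiguity that makes timelines hard to correlate with logs.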
SR examples
“The case of the unexplained …”
Example 1 – Oracle DB performance
Tales from GSS
Initial SR description:
• Oracle DB on virtual 64bit W2K8 three times slower than physical
• on 32bit W2K8 and 32/64bit RHEL5, only 5% slower than physical
• benchmarked with production equivalent test script
Troubleshooting in support:
• checked logs for errors
• basics like power management, limits, etc.
• research if similar issues have been reported
Reproducing in-house:
• the customer provided two pre-configured VMs
• during initial run, the 64bit VM performed worse by a factor of 3
• automated benchmark start and result collection, dropped to 1.6 on avg.
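Automating the start and collection of a benchmark is what turned a single "factor of 3" observation into an average of 1.6 here. A generic sketch of such a harness, not the script used in this SR: `run_benchmark` and `slowdown` are names made up for the example, and the sample durations at the end are invented to show the calculation.

```python
import statistics
import time


def run_benchmark(workload, runs=5):
    """Run `workload` (a callable) `runs` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def slowdown(test_durations, baseline_durations):
    """Ratio of mean durations; 3.0 means the test system is three times slower."""
    return statistics.mean(test_durations) / statistics.mean(baseline_durations)


# Timing a trivial placeholder workload, just to show the harness runs.
sample = run_benchmark(lambda: sum(range(10_000)), runs=3)
print(f"{len(sample)} runs, mean {statistics.mean(sample):.6f} s")

# Invented durations: one slow first run skews a single comparison,
# the average over several runs gives a steadier factor.
factor = slowdown([4.8, 3.2, 1.6], [2.0, 2.0, 2.0])
print(f"average slowdown factor: {factor:.1f}")
```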
Murphy's law strikes:
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf
2020-ntn-vsphere_performance_principles_bondzio.pdf

More Related Content

Similar to 2020-ntn-vsphere_performance_principles_bondzio.pdf

SharePoint 2010's Virtual Reality
SharePoint 2010's Virtual RealitySharePoint 2010's Virtual Reality
SharePoint 2010's Virtual RealityMichael Noel
 
Windows 10 Minimum Hardware Specification
Windows 10 Minimum Hardware SpecificationWindows 10 Minimum Hardware Specification
Windows 10 Minimum Hardware SpecificationBenoît Chamontin
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringNETWAYS
 
Sprint 138
Sprint 138Sprint 138
Sprint 138ManageIQ
 
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...Salman Shaikh ヅ
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
SharePoint 2010 Virtualization - SharePoint Saturday L.A.
SharePoint 2010 Virtualization - SharePoint Saturday L.A.SharePoint 2010 Virtualization - SharePoint Saturday L.A.
SharePoint 2010 Virtualization - SharePoint Saturday L.A.Michael Noel
 
SharePoint 2010 Virtualization - Norway SharePoint User Group
SharePoint 2010 Virtualization - Norway SharePoint User GroupSharePoint 2010 Virtualization - Norway SharePoint User Group
SharePoint 2010 Virtualization - Norway SharePoint User GroupMichael Noel
 
VMworld 2013: Quantifying the Business Value of VMware Horizon View
VMworld 2013: Quantifying the Business Value of VMware Horizon View VMworld 2013: Quantifying the Business Value of VMware Horizon View
VMworld 2013: Quantifying the Business Value of VMware Horizon View VMworld
 
Persistent BIOS Infection
Persistent BIOS InfectionPersistent BIOS Infection
Persistent BIOS Infectionguest042636
 
Persistent Bios Infection
Persistent Bios InfectionPersistent Bios Infection
Persistent Bios Infectionguest042636
 
Optimizing Your z/OS Mainframe Through zIIP Offload and SQL Analysis
Optimizing Your z/OS Mainframe Through zIIP Offload and SQL AnalysisOptimizing Your z/OS Mainframe Through zIIP Offload and SQL Analysis
Optimizing Your z/OS Mainframe Through zIIP Offload and SQL AnalysisPrecisely
 
Virtualization & Network Connectivity
Virtualization & Network Connectivity Virtualization & Network Connectivity
Virtualization & Network Connectivity itplant
 
Kauli SSPにおけるVyOSの導入事例
Kauli SSPにおけるVyOSの導入事例Kauli SSPにおけるVyOSの導入事例
Kauli SSPにおけるVyOSの導入事例Kazuhito Ohkawa
 
Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?DataCore Software
 
Deep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in KubernetesDeep Learning and Gene Computing Acceleration with Alluxio in Kubernetes
Deep Learning and Gene Computing Acceleration with Alluxio in KubernetesAlluxio, Inc.
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVMJohn Lee
 

Similar to 2020-ntn-vsphere_performance_principles_bondzio.pdf (20)

SharePoint 2010's Virtual Reality
SharePoint 2010's Virtual RealitySharePoint 2010's Virtual Reality
SharePoint 2010's Virtual Reality
 
Windows 10 Minimum Hardware Specification
Windows 10 Minimum Hardware SpecificationWindows 10 Minimum Hardware Specification
Windows 10 Minimum Hardware Specification
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
 
Sprint 138
Sprint 138Sprint 138
Sprint 138
 
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...
Ciscounifiedcomputingsystemucschangingtheeconomicsdatacenter 130514165541-php...
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
 
SharePoint 2010 Virtualization - SharePoint Saturday L.A.
SharePoint 2010 Virtualization - SharePoint Saturday L.A.SharePoint 2010 Virtualization - SharePoint Saturday L.A.
SharePoint 2010 Virtualization - SharePoint Saturday L.A.
 
SharePoint 2010 Virtualization - Norway SharePoint User Group
SharePoint 2010 Virtualization - Norway SharePoint User GroupSharePoint 2010 Virtualization - Norway SharePoint User Group
SharePoint 2010 Virtualization - Norway SharePoint User Group
 
2020-ntn-vsphere_performance_principles_bondzio.pdf

  • 1. DOAG 2020 │ ©2020 VMware, Inc. ESXi Performance Principles DOAG Edition Valentin Bondzio Sr. Staff TSE / GSS Premier Services 2020-01-23
  • 2. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. Brief Intro
  • 3. Brief Intro: @VMware since 2009; Global Support Services / Premier Services; focus on Resource Management, Performance and Windows Internals; originally from Berlin, living in Ireland since 2007. And most importantly …
  • 4–5. Brief Intro: Not an Oracle expert!
  • 6–9. Agenda: CPU Scheduling and Usage Accounting (the “basics”); “Power Management” (the Good, the Better and the Ugly); ESXi Memory Management (more “basics”); Local resource distribution (what else is running on ESXi); CPU Topology Abstraction (CPU Socket != NUMA node). Added one per slide on slides 7–9: + I/O stuff, + vMotion, + Backup
  • 10. CPU Scheduler Overview: resource guarantees and weighting (shares) on a per-VM or “Resource Pool” level
  • 11–13. CPU Scheduler Overview: What does the scheduler do? (Local) Dispatch VMs (their “worlds”) to honor CPU settings. For fairness: select the VM with the least (consumed CPU time / fair share). For priority: run a latency-sensitive (high) VM before anyone else. [Diagram labels: vCPU, HT / Core, IO]
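The fairness rule on slide 12 can be sketched in a few lines of Python. This is a minimal illustration, not ESXi's actual scheduler: the `World` class, the share values and the consumed times are hypothetical, and `shares` stands in for the “fair share” term on the slide.

```python
from dataclasses import dataclass


@dataclass
class World:
    name: str
    shares: int         # relative weight (hypothetical values)
    consumed_ms: float  # CPU time consumed so far (hypothetical values)


def pick_next(worlds):
    """Return the world with the smallest consumed-time / shares ratio."""
    return min(worlds, key=lambda w: w.consumed_ms / w.shares)


worlds = [
    World("vm-a", shares=1000, consumed_ms=400.0),  # ratio 0.40
    World("vm-b", shares=2000, consumed_ms=600.0),  # ratio 0.30
    World("vm-c", shares=1000, consumed_ms=350.0),  # ratio 0.35
]

print(pick_next(worlds).name)  # vm-b: most CPU time consumed, but lowest ratio
```

Note that vm-b is picked although it has consumed the most CPU time: its larger share entitles it to more, so it is still the furthest behind.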
  • 14–18. CPU Scheduler Overview: What does the scheduler do? (Global) Place the worlds / threads on physical CPUs: to balance load across physical execution contexts (PCPUs); to preserve cache state and minimize migration cost; to avoid contention from hardware (HT, LLC, etc.) and sibling vCPUs (from the same VM); to keep VMs or threads that communicate frequently close to each other. [Diagram labels: LLC (×2), eight cores each with HT 0 and HT 1, VMs placed on them]
  • 19–20. CPU Scheduler Overview: How does that look? (esxtop output)
      10:10:29am up 2 days 48 min, 674 worlds, 1 VMs, 2 vCPUs; CPU load average: 0.02, 0.01, 0.01
      PCPU USED(%): 0.3 0.1 0.0 0.3 0.2 0.1 0.0 0.0 0.0 0.2 50 50 4.1 0.1 0.1 0.0 0.0 0.0 0.1 0.0 0.0 0.1 0.0 0.0 AVG: 4.4
      PCPU UTIL(%): 0.5 0.1 0.1 0.6 0.2 0.2 0.0 0.2 0.0 0.3 100 100 4.2 0.2 0.1 0.1 0.0 0.0 0.1 0.0 0.0 0.2 0.1 0.1 AVG: 8.6
      CORE UTIL(%): 0.6 0.7 0.4 0.9 0.3 100 4.3 0.2 0.0 0.1 0.4 0.7 AVG: 9.1

         ID    GID NAME                       NWLD  %USED   %RUN   %SYS  %WAIT %VMWAIT   %RDY  %IDLE %OVRLP
      96337 148153 vmx                           1   0.02   0.01   0.02  61.82       -  37.86   0.00   0.00
      96339 148153 NetWorld-VM-96338             1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
      96340 148153 NUMASchedRemapEpochInitial    1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
      96341 148153 vmast.96338                   1   0.03   0.05   0.00  99.63       -   0.00   0.00   0.00
      96343 148153 vmx-vthread-6                 1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
      96344 148153 vmx-mks:Debian86              1   0.00   0.00   0.00  61.55       -  38.13   0.00   0.00
      96345 148153 vmx-svga:Debian86             1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
      96346 148153 vmx-vcpu-0:Debian86           1  62.35  99.68   0.00   0.00    0.00   0.00   0.00   0.05
      96348 148153 vmx-vcpu-1:Debian86           1  62.36  99.67   0.00   0.00    0.00   0.01   0.00   0.05
      96347 148153 PVSCSI-96338:0                1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
      96350 148153 vmx-vthread-7:Debian86        1   0.00   0.00   0.00  99.68       -   0.00   0.00   0.00
  • 21–23. CPU Usage Accounting: What states are there? Running and Not Running, where Not Running splits into Idle (descheduled) and Ready
  • 24. CPU Usage Accounting: In an ideal world. [Diagram labels: Idle (descheduled), Running, Ready, Usage]
  • 25–28. CPU Usage Accounting: What is charged against the VM. [Diagram labels: Idle (descheduled), Running, Ready, Usage; Overlap, HT busy, Frequency, .. (“stolen time”); sys, VMwait, wait; CSTP, RDY, MLMT]
  • 29–31. CPU Usage Accounting: stolen time aka “%LAT_C”. %LAT_C captures the gap between “ideal” execution (demand) and “current” execution. “Ideal”: unlimited dedicated cores running at nominal processor frequency. Sources of compute latency: VM resource contention (check %RDY and %CSTP); power management (P-State): frequency throttling; hardware contention: HTs are in use. [Diagram labels: Demand, Ideal, Current, %LAT_C]
  • 32–38. Intel® Hyper-Threading Technology: Cores and Threads. Does enabling HT “spawn” a less capable “logical core”? Maybe two slightly less capable “logical” cores? [Diagram labels: “physical” core plus “logical” core; “physical” core containing “logical” core0 and “logical” core1]
  • 39. Intel® Hyper-Threading Technology: individual throughput reduction, aggregated throughput increase at high load. [Diagram values: 100, 100, ~125]
  • 40–43. Intel® Hyper-Threading Technology on ESXi: throughput reduction is accounted for in USED. [Diagram values: 100, 100, 125] With both hyperthreads of a core busy, each of the 2 threads is charged 50 + 12.5 = 62.5 (2 × 62.5 = 125). HTEfficiencyShift (default: 2) sets how much more efficient HT is than no-HT: 1: 50 %, 2: 25 %, 3: 12.5 %, 4: 6.25 %, 5: 3.125 %
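The HTEfficiencyShift table on slide 43 follows a simple pattern: each step halves the assumed HT gain. A minimal Python sketch, assuming the gain is 100 / 2^shift percent and that the core's total is split evenly between two busy hyperthreads (the function names are illustrative, not ESXi identifiers):

```python
def ht_gain_percent(shift: int) -> float:
    """Assumed extra throughput of a core with HT, in percent of one thread."""
    return 100.0 / (2 ** shift)


def used_per_thread_both_busy(shift: int = 2) -> float:
    """%USED charged to each hyperthread when both run flat out."""
    core_total = 100.0 + ht_gain_percent(shift)  # e.g. 125 for shift 2
    return core_total / 2


for shift in range(1, 6):
    print(shift, ht_gain_percent(shift), used_per_thread_both_busy(shift))
```

With the default shift of 2 this gives 62.5 per thread, which is in line with the roughly 62.35 %USED at 99.68 %RUN shown for the two vCPU worlds in the esxtop output on slide 19.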
  • 44. CPU Usage Accounting: Usage vs. Utilization
  • 45–48. Power Management (umbrella term): P-States, Deep C-States. Options aka: Power Regulator, CPU Power Management, EIST
  • 49. Power Management refresher … P-State = voltage / frequency point. C-State = idle state: running (C0) or varying degrees of stuff turned off (C1-Cn). [Diagram labels: Frequency axis with P0 / TB, P1 / NF, P2 … P13; C0, C1-Cn]
  • 50–52. C-State Transition: C1, ~1 µs
  • 53–56. Deep C-State Transition: C6, ~30 µs
  • 57. Dell Power Management _Profiles_
  • 58. ESXi Power Management Policy: only affects what’s presented from the BIOS
  • 59–62. Power Management refresher … Who controls what? (→ allow control / use) Layers: CPU, BIOS, ESXi, VM / guest. [Diagram labels: deep C-States, P-States, HLT / C1-Cn, P-States, HLT / C1]
  • 63–69. ESXi Power Management Policy: only affects what’s presented from the BIOS (Dell terminology). System Profile → "Performance Per Watt (DAPC)", "Performance Per Watt (OS)", "Performance", "Dense Configuration", "Custom". CPU Power Management → "System DBPM (DAPC)", "OS DBPM", "Maximum Performance". C States → "Enabled", "Disabled". [Diagram annotations: P-States (×3), C-States (×2)]
  • 70–73. Which BIOS policy am I running on? Most likely … [Screenshot annotations: most likely “Dynamic”; very likely “Performance”]
  • 74–75. Which BIOS policy am I running on? Most likely “Dynamic” (esxtop output)
      4:30:58pm up 2 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
      Power Usage: 94W, Power Cap: N/A
      PSTATE MHZ:

      CPU  %USED  %UTIL    %C0    %C1    %C2 %A/MPERF
        0    0.3    0.7      1     23     76     50.0
        1    0.0    0.0      0      0    100     50.1
        2    0.1    0.2      0      6     94     50.0
        3    0.0    0.0      0      0    100     50.1
        4    5.2   10.4     10      5     85     50.0
        5    0.0    0.0      0      5     95     51.0
        6    0.0    0.1      0      3     97     50.0
        7    0.0    0.0      0      0    100     50.0
        8    0.1    0.4      0     16     84     50.0
        9    0.0    0.0      0      0    100     50.0
       10    0.0    0.0      0      0    100     50.0
      (…)
  • 76–77. Which BIOS policy am I running on? Most likely “Performance” (esxtop output)
      4:38:51pm up 1 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
      Power Usage: 142W, Power Cap: N/A
      PSTATE MHZ:

      CPU  %USED  %UTIL    %C0    %C1 %A/MPERF
        0    0.0    0.1      0    100    108.3
        1    0.1    0.1      0    100    108.4
        2    0.1    0.1      0    100    108.3
        3    0.0    0.1      0    100    108.4
        4    0.0    0.0      0    100    108.3
        5   18.0   16.7     17     83    108.3
        6    0.0    0.1      0    100    108.4
        7    0.2    0.2      0    100    108.3
        8    0.0    0.0      0    100    108.3
        9    0.1    0.2      0    100    108.3
       10    0.0    0.1      0    100    108.3
      (…)
  • 78–79. Which BIOS policy am I running on? Most likely “Custom” (esxtop output; %P1 through %P12 are 0 in every row shown and are collapsed into one column)
      5:09:53pm up 6 min, 827 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.01, 0.01, 0.00
      Power Usage: 107W, Power Cap: N/A
      PSTATE MHZ: 2401 2400 2300 2200 2100 2000 1900 1800 1700 1600 1500 1400 1300 1200

      CPU  %USED  %UTIL    %C0    %C1    %C2    %P0 %P1..%P12   %P13 %A/MPERF
        0    0.2    0.4      0     16     83     62         0     38     75.2
        1    0.0    0.0      0      0    100      0         0    100     59.3
        2    0.0    0.1      0      5     95     15         0     85     57.9
        3    0.0    0.0      0      1     98     38         0     62     61.5
        4    0.0    0.0      0      4     96      5         0     95     52.0
        5    0.0    0.0      0      0    100      0         0    100     50.3
        6    0.1    0.1      0      1     99      7         0     93     67.7
        7    0.1    0.1      0      0    100     99         0      1     77.7
        8    0.0    0.0      0      0    100     10         0     90     50.8
        9    0.0    0.1      0      0    100      0         0    100     51.6
       10    0.0    0.0      0      3     97      8         0     92     54.0
      (…)
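The %A/MPERF column in the three captures is what separates the policies: it is the ratio of the APERF and MPERF counters, so multiplying it by the nominal frequency gives a rough effective clock while the CPU was executing. A minimal Python sketch, assuming a nominal frequency of 2400 MHz (the P1 entry of the PSTATE MHZ row in the “Custom” capture; that all three captures come from the same host is an assumption, and the helper name is illustrative):

```python
NOMINAL_MHZ = 2400  # assumption: P1 / nominal frequency from the "Custom" capture


def effective_mhz(aperf_mperf_percent: float, nominal_mhz: int = NOMINAL_MHZ) -> float:
    """Approximate effective clock from an esxtop %A/MPERF value."""
    return nominal_mhz * aperf_mperf_percent / 100.0


# %A/MPERF of CPU 0 in each capture above
samples = [("Dynamic", 50.0), ("Performance", 108.3), ("Custom", 75.2)]

for policy, pct in samples:
    print(f"{policy:12s} {pct:6.1f} %  ->  ~{effective_mhz(pct):.0f} MHz")
```

Under that assumption, 50 % corresponds to about 1200 MHz, which matches the lowest entry in the PSTATE MHZ row, while values above 100 % indicate Turbo Boost.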
  • 80–82. The magic of Turbo Boost: dynamic, supported overclocking. [Diagram labels: Frequency axis with P1 and turbo bins TB1 … TB7; C-State depth axis with C0, C1, C6]
  • 83–86. Power Policy “playfield” (scale: Bad, Good, Optimal*): BIOS “Dynamic” pre-Haswell; BIOS “Dynamic” on Haswell+; BIOS “Maximum / High Performance” (same* as Custom BIOS + High Performance ESXi policy, with the exception of C1E); Custom BIOS + Custom or Balanced ESXi policy. * A few workloads fare better with more deterministic performance
  • 87. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 87 Power Policy “playfield" Custom done right!
  • 88. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 88 Power Policy “playfield" Custom done right! Custom BIOS + ESXi Balanced “Dynamic”
  • 90. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 90 Power Policy “playfield” Custom done right! “Performance” Custom BIOS + ESXi Balanced “Dynamic” Custom BIOS + ESXi Balanced “Dynamic”
  • 92. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 92 “Why doesn’t the frequency I see in Task Manager change?” Frequently Asked Questions Power Management Trivia
  • 93. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 93 “Why doesn’t the frequency I see in Task Manager change?” • Possibility 1: You are looking at the brand string • Possibility 2: You are looking in the right place (but the guest OS has no way of knowing) Frequently Asked Questions Power Management Trivia
  • 94. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 94 “Why doesn’t the frequency I see in Task Manager change?” • Possibility 1: You are looking at the brand string • Possibility 2: You are looking in the right place (but the guest OS has no way of knowing) • Base frequency should be: CPUID.(EAX=16h):EAX[15-00] – But it seems Windows is getting that from SMBIOS Frequently Asked Questions Power Management Trivia
  • 95. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 95 “Why doesn’t the frequency I see in Task Manager change?” • Possibility 1: You are looking at the brand string • Possibility 2: You are looking in the right place (but the guest OS has no way of knowing) • Base frequency should be: CPUID.(EAX=16h):EAX[15-00] – But it seems Windows is getting that from SMBIOS Frequently Asked Questions Power Management Trivia
    # grep cpuid ./WinTest.vmx
    cpuid.16.eax = "----------------0100011100011000"
    cpuid.coresPerSocket = "6"
    cpuid.brandstring = "VMware (R) SuperSecretCPU (R) @ 18.2 GHz"
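The `cpuid.16.eax` mask on this slide can be decoded by hand. The sketch below assumes the usual .vmx mask convention, in which `-` leaves a bit as it is and `0` / `1` force it, and reads bits 15..0 as the base frequency in MHz, as the CPUID.(EAX=16h):EAX[15-00] bullet says. The helper name `base_mhz_from_mask` is made up for illustration.

```python
def base_mhz_from_mask(mask: str) -> int:
    """Decode bits 15..0 of a 32-character CPUID mask string as MHz."""
    if len(mask) != 32:
        raise ValueError("expected 32 characters, bit 31 first")
    low16 = mask[16:]  # the characters that describe bits 15..0
    if set(low16) - {"0", "1"}:
        raise ValueError("bits 15..0 must be forced to 0 or 1 to be decoded")
    return int(low16, 2)


mask = "----------------0100011100011000"  # value from the slide
mhz = base_mhz_from_mask(mask)
print(f"{mhz} MHz = {mhz / 1000:.1f} GHz")  # 18200 MHz = 18.2 GHz
```

The result matches the joke brand string on the slide: the forced low 16 bits are 0x4718, which is 18200 MHz.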
  • 96. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 96 “I turned off all C-States, why is it still showing C1 in esxtop?” Frequently Asked Questions Power Management Trivia
  • 97. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 97 “I turned off all C-States, why is it still showing C1 in esxtop?” Frequently Asked Questions Power Management Trivia
    4:38:51pm up 1 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
    Power Usage: 142W, Power Cap: N/A
    PSTATE MHZ:
    CPU %USED %UTIL %C0 %C1 %A/MPERF
      0   0.0   0.1   0 100    108.3
      1   0.1   0.1   0 100    108.4
      2   0.1   0.1   0 100    108.3
      3   0.0   0.1   0 100    108.4
      4   0.0   0.0   0 100    108.3
      5  18.0  16.7  17  83    108.3
      6   0.0   0.1   0 100    108.4
      7   0.2   0.2   0 100    108.3
      8   0.0   0.0   0 100    108.3
    (…)
  • 98. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 98 “I turned off all C-States, why is it still showing C1 in esxtop?” • You can’t turn off C1, you can disable different levels of deep C-States (C2+) Frequently Asked Questions Power Management Trivia
    4:38:51pm up 1 min, 1276 worlds, 0 VMs, 0 vCPUs; CPU load average: 0.02, 0.00, 0.00
    Power Usage: 142W, Power Cap: N/A
    PSTATE MHZ:
    CPU %USED %UTIL %C0 %C1 %A/MPERF
      0   0.0   0.1   0 100    108.3
      1   0.1   0.1   0 100    108.4
      2   0.1   0.1   0 100    108.3
      3   0.0   0.1   0 100    108.4
      4   0.0   0.0   0 100    108.3
      5  18.0  16.7  17  83    108.3
      6   0.0   0.1   0 100    108.4
      7   0.2   0.2   0 100    108.3
      8   0.0   0.0   0 100    108.3
    (…)
  • 99. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 99 “I won’t have any issues if I have everything set to High Performance in the BIOS, right?” Frequently Asked Questions Power Management Trivia
  • 100. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 100 “I won’t have any issues if I have everything set to High Performance in the BIOS, right?” • No, besides possibly: – PSU redundancy issues – Power capping – Temperature – Firmware bugs Frequently Asked Questions Power Management Trivia
  • 101. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 101 “I won’t have any issues if I have everything set to High Performance in the BIOS, right?” • No, besides possibly: – PSU redundancy issues – Power capping – Temperature – Firmware bugs • And definitely … – No ability to control P-/deep C-States – No maximum Turbo Boost frequencies … Frequently Asked Questions Power Management Trivia
  • 102. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 102 “I won’t have any issues if I have everything set to High Performance in the BIOS, right?” • No, besides possibly: – PSU redundancy issues – Power capping – Temperature – Firmware bugs • And definitely … – No ability to control P-/deep C-States – No maximum Turbo Boost frequencies … Frequently Asked Questions Power Management Trivia http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-v3-spec-update.pdf
  • 104. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 104 Frequently Asked Questions Power Management Trivia
  • 105. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 105 “I can clearly see C2 in perfmon on Windows, why are you lying to me?” Frequently Asked Questions Power Management Trivia
  • 106. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 106 “I can clearly see C2 in perfmon on Windows, why are you lying to me?” • This is either a perfmon bug or a choice to represent an “enlightened” idle feature – “Intelligent Timer Tick Distribution (ITTD)” – needs Windows 2012 R2 / vHW 11 – disable via “monitor.disable_guest_idle_msr = true” • you really shouldn’t have to ever … Frequently Asked Questions Power Management Trivia
  • 107. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 107 What runs where and when The high level picture CPU VMK VMM OS / APPs
  • 108. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 108 What runs where and when Mostly Direct Exec CPU OS / APPs
  • 109. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 109 What runs where and when Mostly Direct Exec PCPU vCPU
    (…)
    0xffffffff810a99d0 <+416>: test %eax,%eax
    0xffffffff810a99d2 <+418>: je 0xffffffff810a9932 <cpu_startup_entry+258>
    0xffffffff810a99d8 <+424>: callq 0xffffffff810c6ed0 <rcu_irq_enter>
    0xffffffff810a99dd <+429>: mov 0x82740c(%rip),%r13
    0xffffffff810a99e4 <+436>: test %r13,%r13
    0xffffffff810a99e7 <+439>: je 0xffffffff810a9a07 <cpu_startup_entry+471>
    0xffffffff810a99e9 <+441>: mov 0x0(%r13),%rax
    0xffffffff810a99ed <+445>: nopl (%rax)
    0xffffffff810a99f0 <+448>: mov 0x8(%r13),%rdi
    0xffffffff810a99f4 <+452>: add $0x10,%r13
    0xffffffff810a99f8 <+456>: xor %esi,%esi
    0xffffffff810a99fa <+458>: mov %ebp,%edx
    0xffffffff810a99fc <+460>: callq *%rax
    0xffffffff810a99fe <+462>: mov 0x0(%r13),%rax
    0xffffffff810a9a02 <+466>: test %rax,%rax
    0xffffffff810a9a05 <+469>: jne 0xffffffff810a99f0 <cpu_startup_entry+448>
    0xffffffff810a9a07 <+471>: callq 0xffffffff810c6e40 <rcu_irq_exit>
    0xffffffff810a9a0c <+476>: jmpq 0xffffffff810a9932 <cpu_startup_entry+258>
    0xffffffff810a9a11 <+481>: nopl 0x0(%rax)
    0xffffffff810a9a18 <+488>: mov %gs:0xa0e4,%eax
    0xffffffff810a9a20 <+496>: mov %eax,%eax
    0xffffffff810a9a22 <+498>: bt %rax,(%rbx)
    (…)
  • 110. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 110 What runs where and when What about Idle? CPU vCPU (…) 0xffffffff81052c20 <+0>: sti 0xffffffff81052c21 <+1>: hlt *loud screeching sound*
  • 111. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 111 What runs where and when VMM traps on the privileged instruction and puts (with VMK) the vCPU to “sleep” CPU VMM (…) 0xffffffff81052c20 <+0>: sti 0xffffffff81052c21 <+1>: hlt *tells VMK to deschedule*
  • 112. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 112 What runs where and when The scheduler decides what next to run CPU VMK
  • 113. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 113 What runs where and when E.g. a vCPU / world that is ready to run CPU other vCPU
  • 114. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 114 What runs where and when ESXi’s _own_ idle thread CPU C1-Cn
  • 115. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 115 Manage host physical memory to abstract physical memory away from guest. Allow memory over-commitment to provide an illusion of virtual DRAM to the guest. Hide transient host memory pressure from application Memory Management Overview Goals and Objectives Host Physical Memory Guest Memory ESXi
  • 116. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 116 Virtual Memory Process 0
  • 117. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 117 Virtual Memory Process 0 Process 1 Process 2 Process 3 Process n
  • 118. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 118 Virtual Memory From the process’ point of view, it provides: • Contiguous address space • Isolation / Security Process 0 Process 1 Process 2 Process 3 Process n 256 TB 256 TB 256 TB 256 TB 256 TB
  • 119. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 119 Virtual Memory From the process’ point of view, it provides: • Contiguous address space • Isolation / Security Virtual Memory abstracts Process 0 Process 1 Process 2 Process 3 Process n Magic 256 TB 256 TB 256 TB 256 TB 256 TB
  • 120. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 120 Virtual Memory From the process’ point of view, it provides: • Contiguous address space • Isolation / Security Virtual Memory abstracts • It provides the possibility to overcommit … Process 0 Process 1 Process 2 Process 3 Process n Magic 256 TB 256 TB 256 TB 256 TB 256 TB
  • 121. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 121 Virtual Memory From the process’ point of view, it provides: • Contiguous address space • Isolation / Security Virtual Memory abstracts • It provides the possibility to overcommit … The process is unaware of what is backing the virtual address • Physical Memory • Swap File Process 0 Process 1 Process 2 Process 3 Process n Magic 256 TB 256 TB 256 TB 256 TB 256 TB 64 TB 256 TB
  • 122. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 122 Virtual Physical Memory VM 0 Abstraction …
  • 123. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 123 Virtual Physical Memory VM 0 VM 1 VM 2 VM 3 VM n Abstraction …
  • 124. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 124 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security VM 0 VM 1 VM 2 VM 3 VM n 6 TB 6 TB 6 TB 6 TB 6 TB Abstraction …
  • 125. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 125 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security Virt. Physical Mem. abstracts VM 0 VM 1 VM 2 VM 3 VM n Magic 6 TB 6 TB 6 TB 6 TB 6 TB Abstraction …
  • 126. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 126 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security Virt. Physical Mem. abstracts • It provides the possibility to overcommit … VM 0 VM 1 VM 2 VM 3 VM n Magic 6 TB 6 TB 6 TB 6 TB 6 TB Abstraction …
  • 127. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 127 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security Virt. Physical Mem. abstracts • It provides the possibility to overcommit … The VM is unaware of what is backing the physical address • Physical Memory • Swap File VM 0 VM 1 VM 2 VM 3 VM n Magic 6 TB 6 TB 6 TB 6 TB 6 TB 16 TB *** TB Abstraction …
  • 128. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 128 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security Virt. Physical Mem. abstracts • It provides the possibility to overcommit … The VM is unaware of what is backing the physical address • Physical Memory • Swap File • Or COW, ZIP, BLN VM 0 VM 1 VM 2 VM 3 VM n Magic 6 TB 6 TB 6 TB 6 TB 6 TB 16 TB *** TB *** TB Abstraction … *** TB
  • 129. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 129 Virtual Physical Memory From the VM’s point of view, it provides: • Contiguous address space • Isolation / Security Virt. Physical Mem. abstracts • It provides the possibility to overcommit … The VM is unaware of what is backing the physical address • Physical Memory • Swap File • Or COW, ZIP, BLN VM 0 VM 1 VM 2 VM 3 VM n Magic 6 TB 6 TB 6 TB 6 TB 6 TB 16 TB *** TB *** TB Abstraction … *** TB *** TB *
  • 130. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 130 Understanding VM memory usage on ESXi Memory Management Overview How to Hide Memory Pressure?
  • 131. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 131 Understanding VM memory usage on ESXi Memory Management Overview How to Hide Memory Pressure? Total Memory Size
  • 132. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 132 Understanding VM memory usage on ESXi Memory Management Overview How to Hide Memory Pressure? Total Memory Size Allocated Memory Free Memory
  • 133. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 133 Understanding VM memory usage on ESXi Memory Management Overview How to Hide Memory Pressure? Total Memory Size Allocated Memory Free Memory Active Memory Idle Memory
  • 134. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 134 Understanding VM memory usage on ESXi Reclaim memory from a VM if it is using more than it is entitled to. • Entitlement depends on configuration (reservation / shares / limit). • Techniques to reclaim memory from VMs include: – Page sharing > Ballooning > Compression > Host swapping – Breaks host large pages Memory Management Overview How to Hide Memory Pressure? Total Memory Size Allocated Memory Free Memory Active Memory Idle Memory
  • 135. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 135 Active Memory Not the same as guest stats!
  • 138. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 138 Active Memory Not the same as guest stats! !=
  • 139. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 139 Active Memory ESXi VM level heuristic • Weighted, moving average • OS / VMTools independent • “Memory Sampling” aka Touched
  • 140. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 140 Active Memory ESXi VM level heuristic • Weighted, moving average • OS / VMTools independent • “Memory Sampling” Un-maps 100 random pages over the entire VM’s mapped address space aka Touched VM mapped memory 4 KB 100 x 4 KB 4 KB 4 KB 4 KB 4 KB 4 KB …
  • 141. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 141 Active Memory ESXi VM level heuristic • Weighted, moving average • OS / VMTools independent • “Memory Sampling” Un-maps 100 random pages over the entire VM’s mapped address space Monitors R/W for a minute (access traps to the VMM) aka Touched VM mapped memory 4 KB 100 x 4 KB 4 KB 4 KB 4 KB 4 KB 4 KB … / min
  • 142. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 142 Active Memory ESXi VM level heuristic • Weighted, moving average • OS / VMTools independent • “Memory Sampling” Un-maps 100 random pages over the entire VM’s mapped address space Monitors R/W for a minute (access traps to the VMM) aka Touched VM mapped memory 4 KB 100 x 4 KB 4 KB 4 KB 4 KB 4 KB 4 KB … / min Read Read Write
  • 143. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 143 Active Memory ESXi VM level heuristic • Weighted, moving average • OS / VMTools independent • “Memory Sampling” Un-maps 100 random pages over the entire VM’s mapped address space Monitors R/W for a minute (access traps to the VMM) After one minute, re-maps all remaining pages, starts again aka Touched VM mapped memory 4 KB 100 x 4 KB 4 KB 4 KB 4 KB 4 KB 4 KB … / min
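The sampling steps above amount to a statistical estimate: the fraction of the 100 sampled pages that get touched within the minute is scaled up to the VM's mapped memory, then smoothed over time. A rough Python sketch of that idea follows. The function names and the smoothing weight `alpha` are assumptions for illustration, the slides do not give ESXi's actual weighting.

```python
import random

SAMPLE_PAGES = 100  # pages un-mapped per sampling period (from the slide)


def count_touched(touched_pages, total_pages, rng):
    """Pick SAMPLE_PAGES random page numbers and count how many were touched."""
    sample = rng.sample(range(total_pages), SAMPLE_PAGES)
    return sum(1 for page in sample if page in touched_pages)


def estimate_active_mb(touched, mapped_mb, sampled=SAMPLE_PAGES):
    """Scale the touched fraction of the sample up to the VM's mapped memory."""
    return mapped_mb * touched / sampled


def smooth(previous_mb, current_mb, alpha=0.5):
    """Weighted moving average; alpha is a made-up weight, not ESXi's."""
    return alpha * current_mb + (1 - alpha) * previous_mb


# 25 of 100 sampled pages touched in a VM with 8192 MB mapped
print(estimate_active_mb(25, 8192))  # 2048.0
```

With only 100 samples per period the estimate is coarse (1% steps), which is one reason the slides describe it as a heuristic rather than a measurement.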
  • 144. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 144 Active Memory vs. Consumed
  • 145. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 145 Active Memory What to trust? consumed active
  • 151. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 151 Active Memory What to trust? active consumed
  • 153. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 153 Guest Memory Metrics In a nutshell
  • 158. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 158 Active Memory Guest’s working set tends to be between active and consumed consumed active guest WS
  • 159. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 159 Active Memory Guest WS might overreport (greedy app) active guest WS
  • 160. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 160 Active Memory But guest WS will not underreport consumed active guest WS
  • 161. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 161 Active Memory Not the end-all of guest workload estimation
  • 162. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 162 Hierarchical Resource Groups From an ESXi perspective host The host owns all resources
  • 163. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 163 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user The host owns all resources Those are distributed by hierarchical resource groups
  • 164. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 164 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user The host owns all resources Those are distributed by hierarchical resource groups minfree kernel helper ft drivers vmotion …
  • 165. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 165 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user The host owns all resources Those are distributed by hierarchical resource groups minfree kernel helper ft drivers vmotion … vmkboot CpuSched Init …
  • 166. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 166 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user The host owns all resources Those are distributed by hierarchical resource groups Consumers can demand (request) resources minfree kernel helper ft drivers vmotion … vmkboot CpuSched Init …
  • 167. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 167 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user vCenter shows the sum of all user resources as: Total Reservation Capacity
  • 168. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 168 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user vCenter shows the sum of all user resources as: Total Reservation Capacity Global Resource Pools are then distributed back to hosts into Local RPs • Based on VMs’ demand … pool4 pool3 pool2 pool1
  • 169. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 169 Hierarchical Resource Groups From an ESXi perspective host system vim iofilters user vCenter shows the sum of all user resources as: Total Reservation Capacity Global Resource Pools are then distributed back to hosts into Local RPs • Based on VMs’ demand … vm.vmid vm.vmid vm.vmid … pool4 pool3 pool2 pool1
  • 170. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 170 Hierarchical Resource Groups From an ESXi perspective user Local Resource Groups are created and incrementally numbered when clients are instantiated: • VM starts / vMotions etc. • Based on VMs’ demand … vm.vmid vm.vmid … pool430 pool231 pool15 pool1
  • 171. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 171 Hierarchical Resource Groups From an ESXi perspective user Local Resource Groups are created and incrementally numbered when clients are instantiated: • VM starts / vMotions etc. • Based on VMs’ demand The local hierarchy is equal to the global one • Check for VM / LRG siblings … vm.vmid vm.vmid … pool430 pool231 pool15 pool1 vm.vmid pool321 vm.vmid vm.vmid …
  • 172. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 172 Hierarchical Resource Groups From an ESXi perspective user Local Resource Groups are created and incrementally numbered when clients are instantiated: • VM starts / vMotions etc. • Based on VMs’ demand The local hierarchy is equal to the global one • Check for VM / LRG siblings VM groups have multiple leaf consumers • vmid is local, not global … vm.vmid vm.vmid … pool430 pool231 pool15 pool1 vm.vmid pool321 vm.vmid vm.vmid … vmm uw ...
  • 173. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 173 cpu.resv Reservation cpu.limit Limit cpu.shares Shares cpu.resvLimit Expandable* mem.resv Reservation mem.limit Limit mem.shares Shares mem.resvLimit Expandable* Memory CPU Hierarchical Resource Groups Both Memory and CPU resources host system vim iofilters user … vm.vmid vm.vmid vm.vmid … pool4 pool3 pool2 pool1
  • 174. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 174 ESXi CLI (via SSH) … for CPU … for Memory … for comparison Tools sched-stats memstats esxtop
  • 175. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 175 Tools cmdline for local groups (no VMs) sched-stats
  • 176. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 176 Tools cmdline for local groups (no VMs) # sched-stats -t groups | awk 'NR == 1 sched-stats
  • 177. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 177 Tools cmdline for local groups (no VMs) # sched-stats -t groups | awk 'NR == 1 || $2 ~ /^(vm.|pool)[0-9]+/ sched-stats
  • 178. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 178 Tools cmdline for local groups (no VMs) # sched-stats -t groups | awk 'NR == 1 || $2 ~ /^(vm.|pool)[0-9]+/ || /^ +[0-4] / sched-stats
  • 179. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 179 Tools cmdline for local groups (no VMs) # sched-stats -t groups | awk 'NR == 1 || $2 ~ /^(vm.|pool)[0-9]+/ || /^ +[0-4] / {printf ("%-10s%-12s%-9s%-6s%-6s%-6s%-9s%-6s%-9s%-9s%-10s\n" ,$1, $2, $3, $6, $8, $9, $10, $11, $12, $13, $14)}' sched-stats
  • 180. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 180 Tools cmdline for local groups (no VMs) sched-stats
    # sched-stats -t groups | awk 'NR == 1 || $2 ~ /^(vm.|pool)[0-9]+/ || /^ +[0-4] / {printf ("%-10s%-12s%-9s%-6s%-6s%-6s%-9s%-6s%-9s%-9s%-10s\n" ,$1, $2, $3, $6, $8, $9, $10, $11, $12, $13, $14)}'
    vmgid     name        pgid     vsmps amin  amax  minLimit units ashares  resvMHz  availMHz
    0         host        0        933   1600  1600  1600     pct   4096000  5232     33168
    1         system      0        659   10    -1    -1       pct   500      288      33168
    2         vim         0        271   4944  -1    -1       mhz   500      4344     33768
    3         iofilters   0        3     0     -1    -1       pct   1000     0        33168
    4         user        0        0     0     -1    -1       pct   9000     0        33168
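The `ashares` column in the output above gives the relative weights of the four top-level groups. As a rough illustration of what those weights mean when every group wants more CPU than is available, the Python sketch below splits capacity proportionally to shares. It deliberately ignores reservations, limits and actual demand, so it is a simplification and not the ESXi scheduler's algorithm; the function name is made up.

```python
def share_fractions(shares):
    """Return each group's fraction of a contended resource, by shares only."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# ashares of the children of "host", taken from the sched-stats output
ashares = {"system": 500, "vim": 500, "iofilters": 1000, "user": 9000}

for name, fraction in share_fractions(ashares).items():
    print(f"{name:<10}{fraction:6.1%}")
```

Under these assumptions the `user` group, which holds the VMs, would be weighted at 9000 of 11000 shares, roughly 82%.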
  • 184. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 184 Tools cmdline for local groups (with VMs) memstats
    # memstats -r group-stats -g0 -l2 -s gid:name:min:max::conResv:availResv -u mb | sed -n '/^-\+/,/.*\n/p'
    ---------------------------------------------------------------------------------
       gid name             min       max   conResv availResv
    ---------------------------------------------------------------------------------
         0 host           97823     97823     28917     68907
         1 system         20024        -1     20008     68923
         2 vim                0        -1      3378     68907
         3 iofilters          0        -1        25     68907
         4 user               0        -1      5490     68907
    ---------------------------------------------------------------------------------
  • 187. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 187 (N)UMA + terminology
  • 188. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 188 DIMMs (N)UMA + terminology
  • 189. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 189 DIMMs Socket / Package (N)UMA + terminology 0
  • 190. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 190 DIMMs Socket / Package NUMA node (N)UMA + terminology 0
  • 191. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 191 DIMMs Socket / Package NUMA node (N)UMA + terminology 0 1
  • 192. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 192 DIMMs Socket / Package NUMA node Socket != NUMA node (N)UMA + terminology 0 2 1
  • 194. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 194 DIMMs Socket / Package NUMA node Socket != NUMA node LLC / DIE (N)UMA + terminology 0 2 1
  • 195. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 195 DIMMs Socket / Package NUMA node Socket != NUMA node LLC / DIE (CoD, SNC / Zen1/2) (N)UMA + terminology 0 2 1
  • 196. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 196 Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted)
  • 197. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 197 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted)
  • 198. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 198 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head
  • 199. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 199 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle
  • 200. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 200 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room
  • 201. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 201 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room = L1-L2 / 10 cycles
  • 202. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 202 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room = L1-L2 / 10 cycles this building
  • 203. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 203 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room = L1-L2 / 10 cycles this building = DRAM / 100 cycles
  • 204. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 204 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room = L1-L2 / 10 cycles this building = DRAM / 100 cycles Finland + Algeria
  • 205. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 205 You want to calculate a + b and the operands are in: Importance of Memory Access Latency Jim Gray’s Storage Latency Analogy (slightly adapted) your head = register / 1 cycle this room = L1-L2 / 10 cycles this building = DRAM / 100 cycles Finland + Algeria = Disk / 10^6 cycles
  • 206. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 206 Importance of Memory Access Latency Numbers based on Intel i7-3770 @ 3.4 GHz
  • 209. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 209 Importance of Memory Access Latency Numbers based on Intel i7-3770 @ 3.4 GHz access size cycles ns L3 / Last Level Cache core 0 core 1 core 2 core 3 L1 L1 L1 L1 L2 L2 L2 L2 IMC QPI DRAM
  • 210. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 210 Importance of Memory Access Latency Numbers based on Intel i7-3770 @ 3.4 GHz access size cycles ns L1 32 KB 4-5 1.5 L3 / Last Level Cache core 0 core 1 core 2 core 3 L1 L1 L1 L1 L2 L2 L2 L2 IMC QPI DRAM
  • 211. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 211 Importance of Memory Access Latency Numbers based on Intel i7-3770 @ 3.4 GHz access size cycles ns L1 32 KB 4-5 1.5 L2 256 KB 12 4 L3 / Last Level Cache core 0 core 1 core 2 core 3 L1 L1 L1 L1 L2 L2 L2 L2 IMC QPI DRAM
  • 212. DOAG 2020 NOON2NOON │ ©2020 VMware, Inc. 212 Importance of Memory Access Latency Numbers based on Intel i7-3770 @ 3.4 GHz access size cycles ns L1 32 KB 4-5 1.5 L2 256 KB 12 4 L3 8 MB 30 10 L3 / Last Level Cache core 0 core 1 core 2 core 3 L1 L1 L1 L1 L2 L2 L2 L2 IMC QPI DRAM
  • 214. Importance of Memory Access Latency (numbers based on Intel i7-3770 @ 3.4 GHz)
      access | size   | cycles | ns
      L1     | 32 KB  | 4-5    | 1.5
      L2     | 256 KB | 12     | 4
      L3     | 8 MB   | 30     | 10
      DRAM   | GBs    | 30+    | 66*
      (Diagram: core 0 to core 3, each with its own L1 and L2, sharing the L3 / Last Level Cache; IMC, QPI, DRAM.)
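The relation between the cycles and ns columns can be checked with a small sketch. It assumes a fixed 3.4 GHz clock (no Turbo Boost or power management) and uses the upper bound of the 4-5 cycle range for L1; the function and variable names are illustrative and not part of the original slides. The results land close to the slide's ns column, which appears to be rounded.

```python
# Convert core clock cycles to nanoseconds at a fixed frequency.
CLOCK_GHZ = 3.4  # clock quoted on the slide for the i7-3770


def cycles_to_ns(cycles: float, clock_ghz: float = CLOCK_GHZ) -> float:
    """One cycle at f GHz lasts 1/f nanoseconds."""
    return cycles / clock_ghz


# (level, cycles) pairs taken from the slide's table.
levels = [("L1", 5), ("L2", 12), ("L3", 30)]

for name, cycles in levels:
    print(f"{name}: {cycles} cycles = {cycles_to_ns(cycles):.1f} ns")
```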
  • 215. N(UMA): Pre-Opteron/Nehalem. All sockets share the FSB to the Northbridge (NB) and hence the bandwidth • NB also known as “Memory Controller Hub” or MCH. Uniform memory access latency between every CPU and every DIMM. Von Neumann Bottleneck getting worse with faster CPUs / more RAM. (Diagram: sockets 0 to 3 attached to a single NB.)
  • 218. NUMA: Post-Opteron/Nehalem. Every NUMA node has its own Integrated Memory Controller (IMC) • Some AMD CPUs (Bulldozer and newer) have two nodes per socket / package. Remote access has to go over the interconnect and the remote CPU’s IMC • This adds latency, making local and remote access Non-Uniform. (Diagram: nodes 0 to 3.)
  • 222. NUMA, 2 QPI / IC: latency in ns (nodes 0 to 3)
      CPU / ns |   0 |   1 |   2 |   3
      0        |  72 | 291 | 323 | 294
      1        | 296 |  72 | 293 | 315
      2        | 319 | 296 |  71 | 296
      3        | 290 | 325 | 300 |  71
      Legend: local, adjacent, “routed”
  • 223. NUMA, 3 QPI / IC: latency in ns (nodes 0 to 3)
      CPU / ns |   0 |   1 |   2 |   3
      0        | 136 | 194 | 198 | 201
      1        | 194 | 135 | 194 | 196
      2        | 201 | 194 | 135 | 200
      3        | 202 | 197 | 198 | 135
      Legend: local, adjacent
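To put a number on "Non-Uniform", the two latency matrices can be reduced to an average local latency, an average remote latency and their ratio. This is only a sketch: the values are copied from the two slides, the diagonal is taken to be local access, and the helper name `local_remote_ratio` is made up for illustration.

```python
from statistics import mean


def local_remote_ratio(matrix):
    """Return (mean local ns, mean remote ns, remote/local ratio)."""
    n = len(matrix)
    local = mean(matrix[i][i] for i in range(n))
    remote = mean(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return local, remote, remote / local


# Latencies in ns, as shown on the "2 QPI / IC" and "3 QPI / IC" slides.
two_links = [
    [72, 291, 323, 294],
    [296, 72, 293, 315],
    [319, 296, 71, 296],
    [290, 325, 300, 71],
]
three_links = [
    [136, 194, 198, 201],
    [194, 135, 194, 196],
    [201, 194, 135, 200],
    [202, 197, 198, 135],
]

for label, matrix in (("2 links", two_links), ("3 links", three_links)):
    local, remote, ratio = local_remote_ratio(matrix)
    print(f"{label}: local {local:.1f} ns, remote {remote:.1f} ns, x{ratio:.2f}")
```

In the first system a remote access costs roughly four times a local one; in the second the penalty is below 1.5x, although its local latency is higher.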
  • 224. NUMA: Basic Migration Types. NUMA clients (vCPUs + memory) are kept local to a home node. Balance migrations re-assign the home node; memory follows vCPUs! Locality migrations set the home node to where the most memory resides. (Diagram: nodes 0 to 3.)
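The difference between the two migration types can be illustrated with a deliberately simplified toy model. It is not how the ESXi NUMA scheduler is implemented; the class and function names are hypothetical, and load, fairness and page-migration rates are ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class NumaClient:
    """Toy NUMA client: a home node plus the number of pages on each node."""
    home: int
    pages: dict = field(default_factory=dict)  # node id -> resident pages

    def local_pct(self) -> float:
        total = sum(self.pages.values())
        return 100.0 * self.pages.get(self.home, 0) / total if total else 100.0


def balance_migrate(client: NumaClient, new_home: int) -> None:
    """Re-assign the home node; the memory stays behind and has to follow."""
    client.home = new_home


def locality_migrate(client: NumaClient) -> None:
    """Set the home node to wherever most of the memory already resides."""
    client.home = max(client.pages, key=client.pages.get)


vm = NumaClient(home=0, pages={0: 900, 1: 100})
balance_migrate(vm, 1)
after_balance = vm.local_pct()   # locality drops because memory is still on node 0
locality_migrate(vm)
after_locality = vm.local_pct()  # home node moves back to where the memory is
print(after_balance, after_locality)
```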
  • 232. NUMA Scheduler Consideration: Local Contention vs Remote Access. NUMA migration incurs significant cost. • All pages need to be remapped, i.e. %localMemory initially drops to 0% and slowly recovers. • Copying memory pages across NUMA boundaries costs memory bandwidth. (Chart 1: “Memory Locality & NUMA-migrations (with NUMA Migration)”, %local and #migrations over time in 30 sec units. Chart 2: “Memory Locality & NUMA-migrations (No NUMA Migration)”, same axes.)
  • 241. vNUMA auto-sizing history: we had good(ish) reasons. Timeline (…) 2007 to 2014 (…) with releases ESX 3.5, ESX 4.0, ESX 4.1, ESXi 5.0, ESXi 5.1, ESXi 5.5, ESXi 6.0. Milestones marked on the timeline: my starting date @ VMware; cpuid.coresPerSocket; CPS in GUI & supported; Max vSMP 8; vNUMA; Max vSMP 32; numa.vcpu.min = 9; cpuid.coresPerSocket → numa.vcpu.maxPerVirtualNode.
  • 242. Virtual and Physical Proximity Domains: two levels of abstraction. CPU Topology (CPS): doesn’t influence ESXi sched.; might influence Guest / App sched. vNUMA Topology: VPD doesn’t affect ESXi sched.; PPD does define ESXi NUMA sched. • AKA NUMA client.
  • 249. Case Study: Project Pacific. Running Compute Intensive Benchmark: 43.5% local memory access on native Linux; 99.2% local memory access on Pacific Cluster. https://blogs.vmware.com/performance/2019/10/how-does-project-pacific-deliver-8-better-performance-than-bare-metal.html
  • 250. IO stuff
  • 251. Herculean Network IO: vSphere 6.0 achieves line-rate throughput on a 40GigE NIC. Throughput ↑ from 20.5 to 35.5 Gbps. CPU used ↓ from 36 to 13 % (per Gbps).
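Expressed as relative changes, the two before/after pairs on this slide work out as follows. The numbers come from the slide; the helper `pct_change` is only an illustration.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


throughput = pct_change(20.5, 35.5)  # Gbps
cpu_per_gbps = pct_change(36, 13)    # % CPU used per Gbps

print(f"throughput: {throughput:+.0f} %")
print(f"CPU per Gbps: {cpu_per_gbps:+.0f} %")
```

That is roughly 73 % more throughput for about 64 % less CPU per Gbps.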
  • 257. Network: Virtual NIC coalescing - recap. Trading CPU Cycles for Lower Latency. By default, vSphere tunes for lower CPU usage by batching I/O operations • By default, that is also the case for the RX and TX path on vNICs (here vmxnet3) • When disabled: – Every received packet raises an interrupt immediately – Every packet to be sent is issued immediately. (Diagram: packets delivered one at a time vs. in batches.)
  • 261. Network: Possible Latency Optimizations. Network latency optimization on the VM level. Disable LRO (Large Receive Offload) • Host wide: “Net.Vmxnet3SwLRO = false” • Small packets are no longer concatenated into larger ones. Disable (vNIC) coalescing • VMX option: “ethernetX.coalescingScheme = disabled” • Issue TX immediately and immediately interrupt on RX. Disable Dynamic queueing • NetQueue feature, load balances and combines less used queues • Disabling guarantees a single queue for the VM.
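As a sketch of how the per-vNIC option above could be generated for several adapters, the helper below renders VMX-style `key = "value"` lines. The option name comes from the slide; the helper names and the idea of generating the lines with a script are assumptions for illustration, and any such change should be validated against the VMware documentation for the ESXi version in use.

```python
def low_latency_vnic_options(nic_indexes):
    """Map each ethernetX adapter to the coalescing setting named on the slide."""
    return {f"ethernet{i}.coalescingScheme": "disabled" for i in nic_indexes}


def render_vmx(options: dict) -> str:
    """Render options as VMX-style lines, one `key = "value"` pair per line."""
    return "\n".join(f'{key} = "{value}"' for key, value in sorted(options.items()))


# Example: disable vNIC coalescing on ethernet0 and ethernet1.
print(render_vmx(low_latency_vnic_options([0, 1])))
```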
  • 262. Network – Recommendations. Use vmxnet3 Guest Network Driver: very efficient and required for maximum performance. Evaluate Disabling Interrupt Coalescing: the default mechanism may induce small amounts of latency in favor of throughput. It’s a 10Gb+ World: 1Gb saturation is real, more bandwidth is required today, especially in light of vSAN and MonsterVM vMotion. Use Latency Sensitivity High ‘Cautiously’: while it can reduce latency and jitter in the 10us use case, it comes at a cost with core reservations, etc. It requires FULL CPU and MEM reservation, or it won’t work and won’t tell you.
  • 263. Herculean Storage IO • More than 1 million IOPS from 1 VM. Hypervisor: vSphere 5.1. Server: HP DL380 Gen8. CPU: 2 x Intel Xeon E5-2690, HT disabled. Memory: 256GB. HBAs: 5 x QLE2562. Storage: 2 x Violin Memory 6616 Flash Arrays. VM: Windows Server 2008 R2, 8 vCPUs and 48GB. Iometer config: 4K IO size w/ 16 workers. Reference: http://blogs.vmware.com/performance/2012/08/1millioniops-on-1vm.html
  • 266. Bare-metal to virtual TPC-C* gap, then and now(ish): -30 % then, -10 % now(ish). * Non-compliant, fair-use implementation of the workload on Oracle 12c. Not comparable to official results.
  • 267. Scaling out vs. up on the same host to amortize overhead. (Chart 1: TPC-E on native HP Proliant DL 385 G8, throughput score 1416.37 tpsE on bare metal. Chart 2: TPC-VMS on virtualized HP Proliant DL 385 G8, tpsE of 3 VMs (VM1, VM2, VM3) running TPC-VMS: 470.31, 468.11 and 457.55. Both charts use a 0 to 1600 axis.) http://blogs.vmware.com/vsphere/2013/09/worlds-first-tpc-vms-benchmark-result.html http://www.tpc.org/4064 / http://www.tpc.org/5201
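A back-of-the-envelope check of the point this slide makes: adding up the three per-VM scores and comparing the sum with the native result. The figures are taken from the slide; summing them is only an illustration of "scaling out" and is not an official TPC metric.

```python
native_tpse = 1416.37
vm_tpse = [470.31, 468.11, 457.55]  # the three VMs on the same host

aggregate = sum(vm_tpse)
ratio = aggregate / native_tpse

print(f"aggregate of 3 VMs: {aggregate:.2f} tpsE")
print(f"relative to native: {ratio:.1%}")
```

Under this reading, three VMs on one host together reach about 98.6 % of the bare-metal score.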
  • 271. The Problem - with Database Logs. Storage I/O latencies are higher in a virtualized environment. Usually not a noticeable problem for Data IO • Long (5+ ms) latency on HDDs • Random I/O, many threads banging on the same spindle(s) • Even some SSDs are ~1ms. Not OK for Redo Log access • Short (<<1ms) latency • Sequential I/O, single-threaded, write-only • Typically a write-back cache in the HBA or the array • Check the Top 5 wait events in Oracle AWR or equivalent database health reports.
  • 275. The Solution - Trade CPU Cycles for Lower Latency. By default, vSphere tunes for lower CPU usage by batching I/O operations. But when sensing low IOPS, vSphere stops batching and switches to low latency mode • For lowest latency, put the log device on a vSCSI adapter by itself • Batching and coalescing happen on a per-vSCSI-bus, not per-device(!), basis • Explicit tuning can prove more effective though.
  • 279. The Solution - Trade CPU Cycles for Lower Latency. Explicit workaround on the issuing path: • Default is asynchronous request passing from the vSCSI adapter to the VMkernel – But it dynamically adjusts for the low IOPS case • To explicitly force immediate initiation of I/O operations (sync) – scsiNNN.reqCallThreshold = "1". Explicit workaround on the completion path: • Default is coalescing of virtual interrupts – vSphere automatically suspends interrupt coalescing for low IOPS workloads • Or explicitly disable virtual interrupt coalescing – For PVSCSI: scsiNNN.intrCoalescing = "False" – For other vHBAs: scsiNNN.ic = "False"
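The options above are per vSCSI adapter, so for a dedicated log adapter they could be collected as in the following sketch. The option names are the ones quoted on the slide, with NNN taken to be the adapter number; the helper names and the VMX-style rendering are assumptions for illustration, not an official tool.

```python
def low_latency_vscsi_options(adapter: int, pvscsi: bool = True) -> dict:
    """Build the issuing-path and completion-path settings for scsi<adapter>."""
    prefix = f"scsi{adapter}"
    # The slide names a different coalescing key for PVSCSI and for other vHBAs.
    coalescing_key = "intrCoalescing" if pvscsi else "ic"
    return {
        f"{prefix}.reqCallThreshold": "1",       # issue I/O immediately (sync)
        f"{prefix}.{coalescing_key}": "False",   # no virtual interrupt coalescing
    }


def render_vmx(options: dict) -> str:
    """Render options as VMX-style lines, one `key = "value"` pair per line."""
    return "\n".join(f'{key} = "{value}"' for key, value in sorted(options.items()))


# Example: the redo log disk sits alone on a PVSCSI adapter scsi1.
print(render_vmx(low_latency_vscsi_options(1)))
```

Keeping the log device on its own adapter matters here because, as noted on the earlier slide, batching and coalescing apply per vSCSI bus rather than per device.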
  • 280. Myth Revisited: RDM versus VMFS. VMFS on par or faster than RDM (approx. 1%). Reference: http://www.vmware.com/techpapers/2017/sql-server-vsphere65-perf.html
  • 281. Storage – Recommendations. Use Multiple vSCSI Adapters: allows for more queues and I/Os in flight. Use pvscsi vSCSI Adapter: more efficient I/Os per cycle. Don’t Use RDMs: unless needed for shared disk clustering, no longer a performance advantage. VMware Snapshots Should Be ‘Temporary’: despite constant performance improvements, snapshots should not live forever (Co-Stop, Synchronous). Leverage Your Storage OEM’s Integration Guide: they provide necessary guidance around items like multi-pathing.
  • 282. vMotion
  • 290. vMotion Workflow (Source ESXi Host, Destination ESXi Host, vMotion Network, Datastore): 1. Create VM on Destination 2. Copy Memory 3. Quiesce VM on Source 4. Transfer Device State 5. Resume VM on Destination 6. Power Off VM on Source. Execution Switchover Time of 1 sec.
  • 295. Iterative Memory Pre-Copy (memory copy from Source VM Memory to Destination VM Memory). Phase 0: Copy the VM’s 40GB of memory, trace pages. As we send that memory, the VM dirties 10GB. Phase 1: Retransmit the dirtied 10GB. In the process, the VM dirties another 3GB. Phase 2: Send the 3GB. While that transfer is happening, the VM dirties 1GB. Phase 3: Send the remaining 1GB.
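The convergence of the phases above can be sketched with a simple geometric model. It assumes that the memory dirtied during a phase is a fixed fraction of what that phase transmitted (`dirty_ratio`, roughly dirty rate divided by network bandwidth) and that the final phase starts once the remainder drops to a switchover budget. Both parameters and the function name are illustrative assumptions; the real vMotion logic is more involved, and the slide's 40 / 10 / 3 / 1 GB figures do not follow a constant ratio exactly.

```python
def precopy_phases(memory_gb, dirty_ratio, switchover_gb, max_phases=20):
    """Return the GB sent in each phase of an idealised iterative pre-copy.

    The last entry is the final transfer if it is <= switchover_gb; otherwise
    the copy did not converge within max_phases.
    """
    sent = []
    to_send = memory_gb
    for _ in range(max_phases):
        sent.append(to_send)
        if to_send <= switchover_gb:
            break  # small enough to send while the VM is quiesced
        to_send = to_send * dirty_ratio  # memory dirtied during this phase
    return sent


# 40 GB VM that dirties a quarter of what is sent in each phase.
print(precopy_phases(40, 0.25, 1.0))
```

With a `dirty_ratio` of 1.0 or more the remainder never shrinks, which is the non-converging case that multiple or faster vMotion NICs are meant to avoid.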
  • 296. vMotion of Oracle RAC: It’s been working for a while …
  • 297. Common Issues for Monster VMs (pre 6.5*): Trace Cost; LP remap; Preallocated memory; RDTSC cost (SDPS)
  • 298. - use ESXi 6.5 - use multi NIC (10Gb+!)
  • 299. Performance Troubleshooting
  • 304. How to troubleshoot any issue, no matter how complicated: 1. Identify a related system or component that your team is not responsible for 2. Hypothesize that the issue is with that component 3. Assign the issue to the responsible team 4. When proven wrong, go to 1.
  • 305. Perfectly valid methods to “troubleshoot” or “tune” /s: tuning guide for a completely different system; some advanced option found on a blog; vaguely fitting KB; etc.
  • 311. The biggest enemy: the "XY Problem". 1. I have problem X 2. I think it is because of Y 3. I have problem Y 4. Help me solve problem Y 5. Hey! I still have a problem. tl;dr don’t jump to conclusions.
  • 314. Where to use caution: Believing anybody. “Trust, but verify.”* * From the Russian proverb: "Доверяй, но проверяй" {Doveryai, no proveryai}
  • 317. Where to use caution: Comparing hosts, past and present, etc. (!=). Don’t assume newer == better. Identify all differences.
  • 322. Where to use caution: Relying on Traffic Light Dashboards alone. All metrics green? → All good then! (false negative) Some metrics red? → Something must be broken! (false positive)
  • 324. Where to use caution: Working through a list of known issues. Very good to start with! • Don’t spend more than half an hour. Can be from different perspectives • Application • Resources, e.g.: – CPU contention – Memory pressure – Disk latency – Etc.
  • 328. Apply different methodologies as needed, e.g. directionally. Top → Down: drill down from the application / its metrics • app specific / difficult to "profile" the whole path. Bottom → Up: investigate from the resource point of view • easy to run into false positives / not all resources evenly covered. Recommendation: Bottom Up Checklist first.
  • 329. Ask questions (good ones, preferably): What makes you think there is a performance issue? Has it ever performed well? What has changed since? Can it be quantified? What else is affected? What is the timing? Is it reproducible? etc.
  • 331. Take notes along the way (seriously). "Remember kids, the only difference between science and screwing around is writing it down."
  • 332. Provide an exact timeline (part of notetaking but often forgotten): 2017-11-28 23:00 UTC Upgrade; 2017-11-29 07:00 UTC Issue first noticed; 2017-11-29 > 23:59 UTC Tried everything under the sun and wrote down nothing; 2017-11-30 08:00 Called GSS
  • 333. Be accurate and universal: https://xkcd.com/1179/
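The timeline advice above can be made mechanical. The following is a minimal sketch, not something from the talk: it assumes Python and a hypothetical helper named `note`, and it formats every entry in the `YYYY-MM-DD HH:MM UTC` style used on the timeline slide, converting timezone-aware local timestamps to UTC so that entries from different people stay comparable.

```python
from datetime import datetime, timedelta, timezone


def note(when: datetime, text: str) -> str:
    """Format one timeline entry as 'YYYY-MM-DD HH:MM UTC  <text>'.

    `note` is a hypothetical helper for illustration. It refuses naive
    timestamps, because a time without a zone is ambiguous in a timeline.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return f"{stamp}  {text}"


# Entries taken from the timeline slide.
timeline = [
    note(datetime(2017, 11, 28, 23, 0, tzinfo=timezone.utc), "Upgrade"),
    note(datetime(2017, 11, 29, 7, 0, tzinfo=timezone.utc), "Issue first noticed"),
]

# A local timestamp (UTC+1 here, chosen only as an example) is normalized to UTC.
local = datetime(2017, 11, 29, 8, 0, tzinfo=timezone(timedelta(hours=1)))
converted = note(local, "Issue first noticed")

for entry in timeline:
    print(entry)
```

Rejecting naive timestamps is a deliberate choice in this sketch: it forces whoever writes the note to state the time zone instead of leaving the reader to guess.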
  • 334. SR examples: “The case of the unexplained …”
  • 336. Example 1 – Oracle DB performance (Tales from GSS). Initial SR description: • Oracle DB on virtual 64-bit W2K8 three times slower than physical • on 32-bit W2K8 and 32/64-bit RHEL5, only 5% slower than physical • benchmarked with a production-equivalent test script. Troubleshooting in support: • checked logs for errors • checked basics like power management, limits, etc. • researched whether similar issues had been reported
  • 341. Example 1 – Oracle DB performance (Tales from GSS). Reproducing in-house: • the customer provided two pre-configured VMs • during the initial run, the 64-bit VM performed worse by a factor of 3 • after automating benchmark start and result collection, the factor dropped to 1.6 on average
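The slowdown factors quoted above are ratios of runtimes, and averaging over several automated runs is what makes them trustworthy. A minimal sketch of that calculation, assuming Python; the function name `slowdown_factor` is hypothetical and the runtimes are made-up placeholders, not measurements from this SR:

```python
from statistics import mean


def slowdown_factor(candidate_secs, baseline_secs):
    """Return mean candidate runtime divided by mean baseline runtime.

    A result of 3.0 means the candidate took three times as long on average.
    Hypothetical helper for illustration only.
    """
    if not candidate_secs or not baseline_secs:
        raise ValueError("need at least one run per configuration")
    return mean(candidate_secs) / mean(baseline_secs)


# Placeholder runtimes in seconds (invented for this example).
baseline_runs = [100.0, 102.0, 98.0]
candidate_runs = [150.0, 160.0, 170.0]

factor = slowdown_factor(candidate_runs, baseline_runs)
print(f"slowdown factor over {len(candidate_runs)} runs: {factor:.2f}")
```

A single manual run can land on either end of the spread, which is one plausible reason a first impression and an automated average can disagree.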
  • 344. Example 1 – Oracle DB performance (Tales from GSS). Murphy's law strikes: