Commercial in Confidence | www.metron-athene.com
Virtualisation Oversubscription
(What’s so scary?)
Phil.bell@metron-athene.com
Topics
• What led me here
• Oversubscription Overview
• CPU Oversubscription
• Memory Oversubscription
• What’s the worst that can happen? (Queueing theory, the simple version)
Overcommit vs Oversubscribe
• Overcommit = Oversubscribe
What led me here
• Clients
– “Oh, we don’t oversubscribe”
• Fear
• Misunderstanding
Flying Navigation by Dead Reckoning
• You know where you started
• You know how long you flew for
• You know your air speed
• You know what direction you flew in
• What if the wind changed in the last 8 hours?
• WW2 bombing saw only 1 in 5 bomb loads land within 5 miles of the target
Virtualisation Used Capacity by Dead Reckoning
• You know what you started with
• You know what you provisioned
• You know how much is left
• Not especially efficient
Oversubscription
• Allocating more than you have
– Thin Provisioning
– Deduplication & Compression
[Diagram: bars comparing what is Allocated with what physically Exists and with what is actually Used]
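A rough sketch of the storage example from the speaker notes: 200 Windows servers with thin-provisioned 32GB system drives, each carrying roughly 12GB of identical OS files. The 2GB of genuinely unique data per server is an assumed figure for illustration only.

```python
servers = 200
allocated_gb_each = 32     # minimum Windows 2012 system drive, all thin provisioned
identical_os_gb = 12       # OS files that are the same on every server
unique_gb_each = 2         # assumed per-server unique data (logs, dumps, updates)

allocated = servers * allocated_gb_each                              # 6400 GB handed out
used_without_dedup = servers * (identical_os_gb + unique_gb_each)    # 2800 GB actually written
used_with_dedup = identical_os_gb + servers * unique_gb_each         # 412 GB: one shared OS copy
print(allocated, used_without_dedup, used_with_dedup)
```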
What can be oversubscribed?
• CPUs
• Memory
• Disk
• NICs
– Nobody ever seems to think about that one
– Traffic between VMs on a single host = no physical NIC involved
– Otherwise…
CPU VMware Maximums
• Virtual Machine Maximum
– 128 vCPUs per VM
• Host CPU maximums
– Logical CPUs per host 480
– Virtual machines per host 1024
– Virtual CPUs per host 4096
– Virtual CPUs per core 32
• The achievable number of vCPUs per core depends on the workload and the specifics of the hardware. For more information, see the latest version of Performance Best Practices for VMware vSphere
https://www.vmware.com/pdf/vsphere6/r60/vsphere-60-configuration-maximums.pdf
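A quick sanity-check sketch of a host design against the per-host maximums on this slide. The example host figures are invented; only the limits come from the slide.

```python
host = {"logical_cpus": 40, "cores": 20, "vms": 200, "vcpus": 400}   # assumed example host

limits = {"logical_cpus": 480, "vms": 1024, "vcpus": 4096}           # per-host maximums above
assert all(host[k] <= limits[k] for k in limits)

vcpus_per_core = host["vcpus"] / host["cores"]     # 20:1 oversubscription in this example
assert vcpus_per_core <= 32   # hard maximum; the workable ratio depends on the workload
print(vcpus_per_core)
```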
Memory VMware Maximums
• 6TB per Host
– Well, 12TB on specific hardware
• 4TB per VM
Memory Oversubscription
• How?
– Free Space
– Page Sharing
– Balloon Driver (VMware)
– Reservations
– Shares
Memory
• Transparent Page Sharing
– Deduplication in memory
• Balloon Driver
– The vmmemctl process “steals” memory inside the VM, allowing that memory to be used by other VMs. This may cause the guest OS to page.
• VMkernel Swap
– The VM thinks its pages are in memory, but ESX has put that memory on disk in a VMkernel swap file.
– “Performance is NOT optimal”
Transparent Page Sharing
[Diagram: VM1 and VM2 map identical memory pages to a single shared copy on the ESX host]
Balloon Driver (vmmemctl)
[Diagram: the balloon driver inflates inside VM1, freeing physical memory that ESX can hand to VM2]
Memory test
• Memory vs. disk speed is…?
– A) Memory is 100x faster than disk
– B) Memory is 1,000x faster than disk
– C) Memory is 10,000x faster than disk
– D) Memory is 100,000x faster than disk
– E) Memory is 1,000,000x faster than disk
– F) I have no memory of the event, your honour
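The answer, using the figures from the speaker notes (roughly 5ms per disk I/O and 50ns per memory access):

```python
disk_io_s = 5e-3         # ~5 ms for a modern disk I/O (speaker notes figure)
memory_access_s = 50e-9  # ~50 ns for a memory access

ratio = disk_io_s / memory_access_s
print(ratio)                   # 100000.0 -> answer D, disk is ~100,000x slower
print(ratio / 3600, "hours")   # if a memory access took 1 s, a disk I/O would take ~27.8 hours
```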
VMkernel Swap
[Chart: a VM’s memory split into the portion the balloon driver can take, the portion that can go to the VMkernel swap file, and the reservation (MB)]
Example:
• Assume maximum memory contention
• By default, up to 65% can be taken by the balloon driver
• The example reservation is 30%
• The remaining 5% can end up in the VMkernel swap (.vswp) file
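A minimal sketch of that split, assuming a hypothetical 4GB VM; the 65% balloon ceiling and 30% reservation are the figures from the slide, the VM size is an assumption.

```python
vm_memory_mb = 4096                     # assumed VM size for illustration
reservation_mb = 0.30 * vm_memory_mb    # guaranteed RAM: cannot be ballooned or swapped
balloon_max_mb = 0.65 * vm_memory_mb    # default ceiling for the balloon driver
vswp_worst_mb = vm_memory_mb - reservation_mb - balloon_max_mb   # ~5% can hit the .vswp file

print(reservation_mb, balloon_max_mb, vswp_worst_mb)   # 1228.8 2662.4 204.8
```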
Memory (example VM)
• 433MB Active Memory
• 2.6GB Unique Memory
• 1.4GB Shared Memory
• 50MB Balloon Driver Memory
• 150MB ESX Overhead for the VM
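A rough sketch of why the shared figure matters at the host level. The per-VM numbers are from the slide; the assumption that ten similar VMs share the same 1.4GB (as the speaker notes suggest) is illustrative only.

```python
vms = 10                 # assumed number of similar VMs on the host
unique_gb_per_vm = 2.6   # memory unique to each VM (from the slide)
shared_gb = 1.4          # memory identical across the VMs (from the slide)

without_sharing = vms * (unique_gb_per_vm + shared_gb)   # 40.0 GB
with_sharing = vms * unique_gb_per_vm + shared_gb        # 27.4 GB: one physical copy
print(without_sharing - with_sharing)                    # ~12.6 GB of RAM not needed
```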
Reservations
• Resource Pools or VMs
• If they want it, they get it
• If they don’t want it, it’s available to all
• Cannot reserve more than exists
• Oversubscribe
– Protect core VMs with a reservation
Memory Idle Tax
• Memory has Shares
• Memory Tax associates a value to each page used
• Default Idle Tax rate is 75%
• This makes idle memory cost 4 times as many shares as active memory
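The 4x figure follows from the tax rate: an idle page is charged at 1 / (1 - tax) times the cost of an active page.

```python
idle_tax = 0.75                        # default idle memory tax rate
idle_page_cost = 1 / (1 - idle_tax)    # cost of an idle page relative to an active page
print(idle_page_cost)                  # 4.0 -> an idle page costs as much as four active pages
```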
CPU Oversubscription
• How?
– Time slicing
– Co-Scheduling
– Reservations
– Shares
– Limits
Time Slicing
• Cores are shared between vCPUs in time slices
– 1 vCPU to 1 core at any point in time
• More vCPUs = More time slicing
• Processes do this on CPUs all the time
– So why is it so scary?
– Over 100 processes on my laptop share 4 CPUs
[Diagram: a VM alternating between running and dormant/idle time slices on a core]
VMware Processor Scheduling: vCPU Co-Scheduling & Ready Time
[Animation: VMs with 1, 2 and 4 vCPUs being placed onto 4 logical CPU threads; a VM can only run when enough threads are free for all of its vCPUs at once, and VMs left waiting accumulate Ready time]
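A toy sketch of the strict co-scheduling idea in the animation: a VM only runs when all of its vCPUs can be placed at once. Real ESX uses relaxed co-scheduling (as the speaker notes mention), so treat this purely as an illustration of why Ready and Co-Stop time appear; the VM names and vCPU counts are invented.

```python
def schedule_interval(free_threads, vms):
    """One scheduling interval: place VMs first-come-first-served, strict co-scheduling."""
    running, waiting = [], []
    for name, vcpus in vms:
        if vcpus <= free_threads:     # all of the VM's vCPUs must fit at the same time
            free_threads -= vcpus
            running.append(name)
        else:
            waiting.append(name)      # accumulates Ready / Co-Stop time instead of running
    return running, waiting

vms = [("small-1", 1), ("small-2", 1), ("small-3", 1), ("big", 4)]
print(schedule_interval(4, vms))
# (['small-1', 'small-2', 'small-3'], ['big'])
# A logical CPU sits idle, yet the 4-vCPU VM still cannot run until 4 threads free up together.
```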
Reservations, Shares & Limits
Reservations
[Chart: CPU used by the Production VM vs. CPU used by the Test VM, with the Production VM’s reservation marked]
1) The Production VM wants to use all the CPU available.
2) The Test VM starts and also wants to use all the CPU available.
3) Each uses 50% of the CPU.
4) The Production VM wants 250MHz of CPU while Test wants to use 4000MHz. Production gets 100% of its request; Test does not.
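Working through the numbers on this slide, assuming the host has 4000MHz available (the amount Test wants to use) and the two VMs have equal shares:

```python
capacity_mhz = 4000        # assumed host capacity: the amount Test wants to use
prod_reservation = 250     # MHz guaranteed to the Production VM

# 1-3) Both VMs want everything; with equal shares each gets half.
prod = test = capacity_mhz / 2              # 2000 MHz each (50% / 50%)

# 4) Production now only wants its 250 MHz; the rest flows to Test.
prod = prod_reservation                     # 250 MHz: 100% of its request
test = capacity_mhz - prod                  # 3750 MHz: 93.75% of what Test wants
print(prod, test, test / capacity_mhz)      # 250 3750.0 0.9375
```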
Reservations & Shares
[Chart: CPU used by the Production VM vs. CPU used by the Test VM, with the Production VM’s reservation marked]
1) The Production VM (2000 shares) wants to use all the CPU available.
2) The Test VM (1000 shares) also wants to use all the CPU available.
3) Production gets 66% of the CPU, Test gets 33%.
4) The Production VM wants 250MHz of CPU while Test could still use 4000MHz. Production gets 100% of its request; Test does not.
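The same scenario with the shares weighting applied, again assuming 4000MHz of host capacity:

```python
capacity_mhz = 4000
shares = {"prod": 2000, "test": 1000}

# 1-3) Both want all the CPU: under contention the split is proportional to shares.
total_shares = sum(shares.values())
prod = capacity_mhz * shares["prod"] / total_shares   # ~2666 MHz (66%)
test = capacity_mhz * shares["test"] / total_shares   # ~1333 MHz (33%)

# 4) Production drops to its 250 MHz reservation; Test takes the remainder.
prod = 250
test = capacity_mhz - prod                            # 3750 MHz, 93.75% of Test's demand
print(round(prod), round(test))
```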
Expandable Reservation 1
• Root (RP): Total CPU 10200 MHz
• Software (RP): Reservation 3000 MHz, Expandable: Yes
– Production (RP): Reservation 1200 MHz, Expandable: Yes
– Test (RP): Reservation 1000 MHz, Expandable: No
• VMs: VM1 (Res 400 MHz), VM2 (Res 300 MHz), VM7 (Res 500 MHz)
• Why can’t VM7 start? 1200 MHz of reservation is required, but only 1000 MHz is available.
Expandable Reservation 2
• Root (RP): Total CPU 10200 MHz
• Software (RP): Reservation 3000 MHz, Expandable: Yes
– Production (RP): Reservation 1200 MHz, Expandable: Yes
– Test (RP): Reservation 1000 MHz, Expandable: Yes
• VMs: VM1 (400 MHz), VM2 (300 MHz), VM3 (500 MHz), VM4 (500 MHz), VM5 (500 MHz), VM6 (500 MHz), VM7 (500 MHz)
• Production: 2000MHz requested against a 1200MHz reservation, 2000MHz of the parent used
• Test: 1200MHz requested, 1000MHz available in the parent
• Where is the “extra” taken from? Software (RP): 3200MHz requested against a 3000MHz reservation, with 200MHz used by Test (RP)
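A simplified sketch of the admission rule these two slides illustrate: a pool must cover a new reservation from its own unreserved capacity, and only an expandable pool may borrow the shortfall from its parent. The dictionary layout and the helper below are an assumption built from the slide’s numbers, not VMware’s API.

```python
def admit(pool, request_mhz):
    """Can this pool accept a new child reservation of request_mhz?"""
    unreserved = pool["reservation"] - pool["reserved"]
    if request_mhz <= unreserved:
        pool["reserved"] += request_mhz
        return True
    if pool["expandable"] and pool["parent"] is not None:
        shortfall = request_mhz - max(unreserved, 0)
        if admit(pool["parent"], shortfall):   # borrow the rest from the parent pool
            pool["reserved"] += request_mhz
            return True
    return False

software = {"reservation": 3000, "reserved": 2200, "expandable": True,
            "parent": {"reservation": 10200, "reserved": 3000, "expandable": False, "parent": None}}
test = {"reservation": 1000, "reserved": 700, "expandable": False, "parent": software}

print(admit(test, 500))   # False: 1200 MHz needed, only 1000 MHz in the pool (example 1)
test["expandable"] = True
print(admit(test, 500))   # True: the extra 200 MHz comes from Software (RP) (example 2)
```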
What’s the worst that can happen?
• Memory
– It fills up
– Then bad things happen
• CPU
– Bad things happen
– Then it’s full/maxed
• Queueing Theory
Contention and Queuing
• Finite system resources
• Single workstation = no contention (usually)
• More than One User = Possible Contention
• Contention = Queuing
– This is COMPLETELY NORMAL
– It’s how operating systems work.
• Excessive Queuing = Poor Performance and Long Response Times
Basic Ideas of Queuing
[Diagram: arriving customers/transactions (A) join a queue, are handled by a server, and leave (L)]
• Response Time = Queuing Time (Q) + Service Time (S)
Utilization and Response Time
[Chart: response time vs. utilization; the curve starts at the service time and climbs steeply as utilization approaches 1.0]
• R = S / (1 - U)
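A quick sketch of the R = S / (1 - U) curve; the 10ms service time is an arbitrary assumed figure.

```python
service_time_s = 0.010                   # assumed 10 ms service time
for u in (0.10, 0.50, 0.80, 0.90, 0.95):
    r = service_time_s / (1 - u)         # R = S / (1 - U) from the slide
    print(f"U={u:.2f}  R={r * 1000:.0f} ms")
# Response time has doubled by 50% busy and is 10x the service time at 90% busy.
```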
Benefits of Multiple Servers
[Chart: response time vs. utilization for single-CPU, dual-CPU and 16-way configurations; the more CPUs, the higher the utilization before the curve turns sharply upward]
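The slide only shows the shape of the curves; a standard M/M/c (Erlang C) calculation reproduces the effect. This is a textbook formula rather than anything from the presentation, and the 10ms service time is again an assumption.

```python
from math import factorial

def mmc_response_time(service_time, utilization, servers):
    """Mean response time for an M/M/c queue, via the Erlang C formula."""
    c, rho = servers, utilization
    a = c * rho                                                  # offered load in Erlangs
    erlang_b = (a ** c / factorial(c)) / sum(a ** k / factorial(k) for k in range(c + 1))
    p_wait = erlang_b / (1 - rho + rho * erlang_b)               # probability an arrival queues
    mean_wait = p_wait * service_time / (c * (1 - rho))
    return service_time + mean_wait

for c in (1, 2, 16):
    print(c, round(mmc_response_time(0.010, 0.90, c) * 1000, 1), "ms")
# At 90% busy: ~100 ms on 1 CPU, ~53 ms on 2, and much closer to the 10 ms service time on 16.
```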
Why are we interested in this queue stuff again?
• VMs Queue for free CPUs
– Ready Time
– Co-Stop time
– Higher utilisation = higher contention
– More concerned about CPU busy than vCPU to logical CPU ratio
– Because it’s maths, you can model it
Roundup
• Oversubscription does not equal unacceptable performance
• Virtualisation is expecting you to oversubscribe
– It’s the reason it exists
• Take the fear out of oversubscription through proper planning
– Plan for performance, not ratios
Thank You
www.metron-athene.com
Phil.bell@metron-athene.com
Editor's Notes
  1. When I wrote the Title and Synopsis for this presentation I happened to choose the word oversubscription. It’s been pointed out to me that many people refer to “overcommit” rather than “oversubscribe”. Both words appear to be in use to describe this. Google does some nice work here to return results using either option.
  2. In my role with Metron I get to visit lots of different people working in lots of different industries. They are mostly capacity managers working in IT departments though, so there is some common ground. Over the past year I’ve had a number of mildly frustrating conversations with organisations. These tend to be newer and/or don’t have a good history of Capacity Management. The frustration has been around ‘Oversubscription’ in virtualised environments. I’ll start talking about how we can monitor the environment to ensure good performance in the future, when they’ll stop me and say, quite proudly, “Oh, we don’t oversubscribe, we don’t want to impact performance”. The pride is almost the worst part. They’ll beam a smile at the senior staff as if to say “We’ve got this, don’t worry”. The problem being the overspend that’s required to have that attitude. Ultimately there is a fear in these departments that oversubscription = poor performance. It’s considered to be a 1:1 relationship. The reason for that is, to some extent, a misunderstanding of what oversubscription is. It’s got the word ‘over’ in it, so it must be bad. Nothing in our department is ‘over’. We’re all looking at the same word: they see something bad, I see an opportunity to save some money. Correct me if you think I’m wrong, but saving money is typically thought to be a good thing.
  3. Avoiding oversubscription is a bit like navigating by dead reckoning. You know where you started. You know how long you were flying, and your air speed, and in what direction. You’ve even tried to take account of the wind speed and direction, but you are using a forecast for those that was out of date when you did the planning, never mind some hours into the flight. In WW2 a bomb was considered to be on target if it was within 5 miles of the actual target. We only managed that with 1 in 5. Dead reckoning isn’t very accurate on its own. The situation is just more complex than that, and the same remains true for people avoiding oversubscription.
  4. The essence of what these sites seem to be doing is this: We start with a 5 Host cluster that has 120 Logical CPUs, and 180GB RAM. We’re then going to issue no more than 96 vCPUs and 144GB RAM across the VMs. This allows for a host to fail and we can still run everything. We’ll also have great performance because VMs will get a CPU whenever they want it, because it’s theirs, and the same with memory. All the memory a VM wants is real RAM. I’m not going to deny that performance will be about as good as it can be. But it’s not going to be terribly efficient. Chances are you could turn off 2 hosts and still see no impact on performance. Who wouldn’t like to reduce their ESX licence and related power costs by 20%, while still having a spare host?
  5. So what is oversubscription? Well the most obvious example happens in storage. Thin provisioning has been around a long time, and is the same thing, by another name. With storage you have the LUNs that are allocated. Now traditionally these would have been a physical allocation on disk that was available for use. But with Thin Provisioning you can allocate more space to LUNs than you actually have. The reason being, that most disks on servers are not full. So if the average disk is 30% full, you could get away with only having 50% of your allocated storage as real usable space that exists, and you’d still have plenty of space to grow into. On top of that, some storage systems will do their own deduplication. So if you have 200 Windows 2012 servers, all with a C drive that just has the OS on it. That’s about 12 GB per server storing the same base OS files. Or 2.4TB of space. Now those OS disks need space for things like memory dumps, updates and log files etc, which is the unused space. But do you want to spend 2.4TB of storage storing the same 12GB of files 200 times? Probably not. You’d prefer to store a single copy of all the identical files and let them all access that single copy. So you’re not just ignoring some of the unused space, you’re able to store less as well. So, 200, 32 GB drives (Minimum Windows 2012 requirement), would be 6.4TB. It’s now theoretically come down to something like 20GB used space with thin provisioning and deduplication. Oversubscription is a good thing.
  6. In our virtual world, now that we have broken the link between the OS and the hardware, we can over-provision all sorts of things. CPU, Memory, Disk (as we mentioned) and NICs are all “Oversubscribed”. Disk we already looked at; Memory and CPU we’ll go into more detail on later. But I thought it was worth mentioning NICs here. Typically people seem to be running with 10 - 15 VMs on a single host, which will have significantly fewer NICs installed. A server typically wouldn’t use all the bandwidth of its NIC, so that unused bandwidth is like the unused space on disk. When the VMs talk to other VMs on the same host, that’s not generating traffic through the physical NICs, so we might consider that the equivalent of de-duplication.
  7. CPU and Memory are the main items people consider for Virtualised systems. So let’s lay down the maximums for a moment. A maximum of 480 Logical CPUs. Logical CPUs being simultaneous threads so that might be 240 hyper-threaded cores. 1024 VMs max on a host, with a max of 4096 vCPUs between them. Then we get to the maximum with a caveat. 32 vCPUs to a core, but it depends on “the workload and specifics of the hardware”. This raises 2 points. 1)Clearly it’s ok to oversubscribe CPUs and 2)There is no set number to tell you how much oversubscription is OK.
  8. Memory is a lot simpler. 6TB or 12TB in a host depending on hardware. 4TB in any single VM.
  9. Having set out those few ground rules we can talk about memory oversubscription. Just like disks have a lot of free space, servers have typically run with free space in memory. Then there are a number of tricks that the hypervisor can do to find even more savings. Page Sharing (Deduplication) The Balloon Driver Reservations Shares and Swap space
  10. Transparent Page Sharing is where the OS stores a single page of memory and shares it out to multiple VMs that have the same page in their memory. It’s very much the same as deduplication on storage. Only one copy of the data is stored. This just happens in VMware; it’s not something you’d turn off. But it does mean that if you are doing a 1:1 VM MB to host MB memory allocation, and assuming you’re running a lot of the same OS and applications etc., then you are going to see a lot of spare memory on the hosts (if you bother to go and look). The Balloon driver is a nice little device. Essentially it inflates in the memory of a VM, asking the OS for pages of RAM. As those pages are all the same, only a single copy is needed in RAM. Now the reason it inflates is to free up that memory for use by another VM. So consistent levels of Balloon Driver memory are an indicator of memory pressure on the host. At this point you may have taken oversubscription a touch too far. The other thing is that the OS doesn’t tell the Hypervisor when a page of memory has been released by a process. So by inflating the balloon driver and then deflating it, you can get the OS to allocate unused pages to the balloon driver, then if they don’t get overwritten when the balloon driver deflates, you know the processes in the OS don’t need that page of memory and you can use it for something else. Of course if the balloon driver inflates and the OS is forced to start pushing pages out into its swap file, that’s not great. Swapping is generally bad. When the hypervisor has to swap memory out to disk things have got really bad. You do not want to see this.
  11. Transparent Page Sharing When two or more Virtual Machines have the same pages of data in memory, VMware can store a single copy and present it to all the VMs. Should a VM alter a shared memory page, a copy will be created by VMware and presented to that VM. Example VM1 starts and allocates some unique memory. VM2 starts and allocates some unique memory. VM1 allocates memory for a standard windows dll VM2 also allocates memory for the same standard windows dll VMware maps both systems memory to the same page in RAM.
  12. Balloon Driver (vmmemctl) The Problem A process in VM1 is shut down and its memory is freed in the OS. The “hardware” does not know. The data is still there but only the OS inside the VM knows it can overwrite it. The VMware Solution When memory gets tight on an ESX host, the VMkernel will pick a VM (based on shares), and tell the balloon driver to request some memory. The balloon driver requests memory and “pins” it so it cannot be paged. The memory on the ESX host is then freed up and can be allocated to another system.
  13. Memory test If memory must be copied to or from disk because there is more requested than can be satisfied, what’s the penalty for doing this ? A modern disk will respond to an I/O in about 5 milliseconds (5 * 1/1000 of a second). Access to memory is usually in the order of 50 nanoseconds (50 *1/1,000,000,000 of a second). That makes disk access a hundred thousand (100,000) times SLOWER than memory access. Tiny numbers like this are difficult to comprehend, so imagine that the memory access time was 1 second. To write something to disk would then take about 27 ¾ hours to complete. That’s one good reason for avoiding swapping if at all possible!
  14. VMkernel Swap A reservation is typically set against a resource pool and filters down to give a VM rights against memory. Essentially if a reservation has been set and applies to this VM, then the VM is guaranteed that amount of memory will be made available in RAM on the ESX host. You can never reserve more memory than exists. So reservations can ensure good performance for the VMs you care about. You put the VMs in a resource pool, and allocate a reservation that’s appropriate. That might be your 1:1 ratio with allocated and reservation. Then let other less important VMs worry about oversubscription. When an ESX host is very short of memory it may have to resort to using .vswp swap files for the VM memory. At this point performance will be affected as data that the OS believes is in memory is, in reality, now on disk. A VM by default can have up to 65% of its memory used by the balloon driver. It may also have a memory reservation. The reservation cannot be swapped or taken up by the balloon driver. Any memory outside the 65% used by the balloon driver, and the reservation, can be placed into a .vswp file. In reality you never want this to happen.
  15. If we look at some stats for a single VM: this is a 4GB VM, but it’s only accessing about 400MB on a regular basis. It’s got 2.6GB of memory that’s unique to itself, and 1.4GB that’s shared with other VMs. So at least one other VM is likely to be sharing about 1.4GB of memory as well. Given there are a lot of Windows VMs in that cluster, it’s likely a lot of them have similar amounts of shared memory. If there are 10 VMs on that host then that’s about 15GB of RAM that you don’t have to have installed. Or rather, a few more VMs that will fit on the host. There’s also a couple of hours where the balloon driver steals some memory from the VM. Only about 50MB, and given the VM’s only accessing 400 to 500MB of RAM out of the 2.6GB that it’s using, the OS probably just released some cache to satisfy that request.
  16. Reservations are associated with Resource Pools or individual VMs. Essentially you are setting a value for CPU or Memory that the VM is guaranteed to get. If the VM doesn’t use all its reservation, other VMs can make use of the memory and CPU. The fairly obvious caveat is that you cannot have a total list of reservations that is bigger than the hardware. You can use reservations to ensure that important VMs get the resources they want. So you don’t have to worry about avoiding oversubscription for everything. Pick the VMs you want to perform their best and give them a reservation that ensures that. Then your background VMs can be pushed out the way if required.
  17. Like reservations, a VM also has an associated number of shares. The more shares, the more priority it has over the resource if there is contention. If a virtual machine is not actively using its currently allocated memory, ESX Server charges a memory tax — more for idle memory than for memory that is in use. That is, the idle memory counts more towards the share allocation than memory in use. The default tax rate is 75 percent, that is, an idle page of memory costs as much as four active pages. The end result is that VMs holding onto a lot of idle memory, will be more likely to have the balloon driver inflate inside them to try and release some of that idle memory for use by other VMs.
  18. Memory is fairly easy to describe but there are a lot of things going on. CPU Oversubscription and the technologies involved can be a little more complex to visualise, but there are less tools that the hypervisor has to work with. For a start, time is no longer a constant. The hypervisor has the ability to run time at whatever speed it likes. Just so long as it averages out in the end. Co-Scheduling is where we have to have all the vCPUs for a single VM, mapped to logical CPUs from the hardware. Reservations and Shares apply here also and we’ll have more of a look at how they work. Limits (also exist for memory), but these can be applied to restrict some VMs down to a smaller amount of CPU than their vCPU allocation would otherwise allow them to have.
  19. In a typical vmware host we have more vCPUs assigned to VMs than we do physical cores. The processing time of the physical cores (or logical CPUs if hyper threading is in play), has to be shared among the vCPUs in the VMs. The more vCPUs we have, the less time each can be on the core, and therefore the slower time passes for that VM. To keep the VM in time, extra time interrupts are sent in quick succession when the VM is processing. So time passes slowly and then very fast. Significant improvements have been made in this area over the releases of VMware. vCPUs can be scheduled onto the hardware a few milliseconds apart. But the basic concept remains in place.
  20. Here’s an animation to show the effect of what is happening inside the host to schedule the physical CPUs/cores to the vCPUs of the VMs. Clearly most hosts have more than 4 consecutive threads that can be processed. But let’s keep this simple to follow. 1)VMs that are “ready” are moved onto the Threads. 2)There is not enough space for all the vCPUs in all the VMs. So some are left behind. (CPU Utilisation = 75%, capacity used = 100%) 3)If a single vCPU VM finishes processing, the spare Threads can now be used to process a 2 vCPU vm. (CPU Utilisation = 100%) 4)A 4 vCPU VM needs to process. 5)Even if the 2 single vCPU VMs finish processing, the 4 vCPU VM cannot use the CPU available. 6)And while it’s accumulating Ready Time, other single vCPU VMs are able to take advantage of the available Threads 7)Even if we end up in a situation where only a single vCPU is being used, the 4 vCPU VM cannot do any processing. (CPU utilisation = 25%)
  21. As mentioned when we discussed time slicing, improvements have been made in the area of co-scheduling with each release of VMware. Amongst other things the time between individual CPUs being scheduled onto the physical CPUs has increased, allowing for greater flexibility in scheduling VMs with large number of vCPUs. Acceptable performance is seen from larger VMs. Along with Ready Time, there is also a Co-Stop metric. Ready Time can be accumulated against any VM. Co-Stop is specific to VMs with 2 or more vCPUs and relates to the time “stopped” due to Co-Scheduling contention. E.g. One or more vCPUs has been allocated a physical CPU, but we are stopped waiting on other vCPUs to be scheduled. I’d love to do an animation of that but my powerpoint skills would need seriously improving. Imagine the bottom of a “ready” VM displayed, sliding across to a thread and the top sliding across as other VMs move off the Threads. So the VM is no longer rigid it’s more of an elastic band.
  22. Reservations, Shares and Limits. VMs and Resource Pools can be allocated Reservations, Shares and Limits. These apply to the amount of CPU and Memory a VM or Resource pool can use. In the example above we have an Engineering Resource Pool containing 2 Virtual Machines. Test has 1000 CPU shares and Production has 2000 CPU shares, giving a total of 3000 shares between them. If there is contention for CPU resource then Production will be given twice as much CPU time as Test. Also notice the Resource Pool has an Expandable Reservation. This means that if there is another resource pool not using its reservation, Engineering could claim and use that reservation if required. This could cause problems if the 2nd resource pool wishes to use its reservation, as it will not be able to push Engineering out. So while this may provide flexibility, its use should be monitored.
  23. Reservations Here’s a quick demonstration of what a reservation does. When both VMs want the same amount of resource (and have the same shares), they will get an even share of the CPU. Assuming they both want all of the 4000MHz available they will each get 50% of what they want. As the Production workload reduces, Test will take more and more of the CPU however Production will always have the rights to use 250MHz CPU. At the point where Production is using 250MHz CPU Production is in effect getting 100% of the CPU it wants while Test is getting 93.75% of the CPU it wants. Despite having the same shares values.
  24. Reservations and Shares If we run the scenario again but this time include the Shares values for the VMs the situation is different. When they are both trying to use all of the CPU the effect of the shares will come into play and with only 1000 shares Test will get 1333MHz of the 4000MHz available while Production will get 2666MHz. Or Test gets 33% of what it wants to use and Production gets 66% of what it wants to use. As the Production workload decreases this ratio should be maintained until Production gets to it’s reservation. At which point Production is in effect getting 100% of the CPU it wants while Test is getting 93.75%
  25. Expandable Reservation When a VM starts the Reservation set for that VM is taken from the Reservation available within the Resource Pool. The total reservations of the child VMs may not be more than the Reservation for the Resource Pool. However if Expandable Reservation is turned on then a Resource Pool may satisfy it’s Reservation requirements by using the Reservation of another Resource Pool. This however may stop the 2nd Resource Pool from starting VMs as it itself cannot satisfy the Reservation requirements of the VM which wants to start.
  26. Expandable Reservation When a VM starts the Reservation set for that VM is taken from the Reservation available within the Resource Pool. The total reservations of the child VMs may not be more than the Reservation for the Resource Pool. However if Expandable Reservation is turned on then a Resource Pool may satisfy it’s Reservation requirements by using the Reservation of another Resource Pool. This however may stop the 2nd Resource Pool from starting VMs as it itself cannot satisfy the Reservation requirements of the VM which wants to start.
  27. What’s the worst that can happen? Well if you push things too far, all those things that the Hypervisor can do to try and keep things running will eventually be overwhelmed. If you try to use too much memory you’ll start to see ballooning on a consistent basis, then swapping. At that point performance will degrade rapidly. Watch active memory values and take ballooning increasing as the indication things are getting tight. CPU is as always a more gentle decay in performance. CPU also has it’s indicators that the limits are being approached. CPU Ready and Co-Stop are indicators that VMs are finding it tricky to find CPUs when they want to do some processing. The reason CPU degrades differently to Memory is that it’s used differently. A process is in memory all the time, but only uses a CPU when it needs. So CPU busy is dictated by how frequently the CPU is required and for how long. The performance of a transaction will be dictated by the ‘chance’ that a CPU will not be available when the transaction arrives. If all the CPUs are busy it’ll enter a queue. And this is where queueing theory comes in.
  28. Any system has a finite set of resources. If you only have a single user trying to use one workstation then there is no contention for the use of that workstation. As soon as you have more than one user then there is a chance that they will want to use the workstation at the same time. That’s contention. But it’s perfectly normal and happens inside every OS all the time. There are lots more process threads than there are CPUs, and when there is contention, then the processes queue. Poor performance only occurs when queueing becomes excessive.
  29. Queueing theory is pretty simple. You have a ‘server’. Think of this as the CPU or the person sat at the checkout scanning groceries. They work at a constant pace, and are fed with work from a queue. The Queue is filled by transactions or customers. The response time of a transaction (from arriving to leaving), is the sum of the time spent queueing, and being served. Given identical transactions, or customers, we know the service time is a constant. What can change is the Arrival rate, and the time spent in the Queue.
  30. What we have here is a chart showing response time on the Y-Axis and the utilisation of the server on the X-Axis. The reason the chart starts part way up the Y-Axis is the Service Time. That’s static. As the utilisation of the server becomes higher the chance of the server being busy when a new transaction/customer arrives increases, and therefore the longer the transaction/customer will spend in the queue. As we can see, it’s not a straight line. All of this can be plotted using the formula R = S / (1-U). Where S is the service time and U is the Utilisation of the server.
  31. When we add in multiple Servers, the line ends up having a more sudden degradation. This change is sometimes known as “the knee of the curve”. The more servers or CPUs we include the higher the utilisation of them before the knee of the curve is observed. This is because there is more chance that a CPU will be available at the moment a piece of work arrives. Given most of the hosts in a virtualised environment are going to have high numbers of CPUs this means we can run them with pretty high utilisations before queueing takes over. Consider though that a multiple vCPU VM needs multiple logical CPUs on the host available to do anything. This has the effect of reducing the number of ‘servers’ or CPUs in the system. If all your VMs are 4 vCPUs and you have 16 logical CPUs in the host. That’s the equivalent of a 1 vCPU VM on a 4 CPU host. The moral of the story here being “use as few vCPUs as possible in each VM, and you’ll reduce queueing and improve performance.
  32. The reason we were talking about queueing theory is that it’s part of how the hypervisor copes with CPU oversubscription. By queueing the VMs. You can see when this starts to happen by monitoring ready and Co-Stop metrics. You should typically be more worried about CPU busy than you are the ratio of CPUs in the VMs to the logical CPUs presented by the hardware. Because all this is maths, people have written programs to model this stuff. So you can see how busy you can run your hosts before performance becomes unacceptable.
  33. Hopefully, if there was anybody in the room who considered oversubscription to mean poor performance, I’ve gone some way to showing you that’s not the case. Virtualisation platforms are set up for this, it’s part of the very reason they exist in the first place. Don’t throw that away. It’s going to cost you money. Plan for performance. Look at the metrics on your systems and use them to model the point where performance will degrade because of utilisation. You cannot do that by looking at the ratio of vCPUs to logical CPUs. But you can with utilisation figures.