8. CPU - Monitoring
• Ready (%RDY)
– % time a vCPU was ready to be scheduled on a physical processor but couldn’t due to processor
contention
– Investigation Threshold: 10% per vCPU
• Co-Stop (%CSTP)
– % time a vCPU in an SMP virtual machine is “stopped” from executing so that another vCPU in the
same virtual machine can run to “catch up”, keeping the skew between the virtual processors from
growing too large
– Investigation Threshold: 3%
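Note that esxtop reports %RDY summed across all of a VM's vCPUs, while the 10% threshold above applies per vCPU. A minimal sketch of the check (the helper name and sample values are hypothetical):

```python
def cpu_ready_exceeds_threshold(rdy_pct_total, vcpu_count, per_vcpu_threshold=10.0):
    """Return True if the per-vCPU ready time warrants investigation.

    esxtop's %RDY for a VM group is the sum over its vCPUs, so a 4-vCPU VM
    showing 24% total %RDY is averaging only 6% per vCPU.
    """
    per_vcpu_rdy = rdy_pct_total / vcpu_count
    return per_vcpu_rdy > per_vcpu_threshold

print(cpu_ready_exceeds_threshold(24.0, 4))  # False -- 6% per vCPU
print(cpu_ready_exceeds_threshold(48.0, 4))  # True  -- 12% per vCPU
```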
9. CPU – Best Practices
• Do not over-allocate vCPUs
• Create single-vCPU VMs whenever possible
• Enable Hyperthreading
• Right Size the VM
– vCPU count should be less than or equal to the number of cores in a single physical CPU (a single
NUMA node)
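The right-sizing rule above reduces to a one-line check. A sketch with a hypothetical helper, where `cores_per_socket` stands in for the core count of one NUMA node:

```python
def fits_in_numa_node(vcpu_count, cores_per_socket):
    """Right-sizing check: a VM's vCPU count should not exceed the core
    count of a single physical CPU (one NUMA node), so the scheduler can
    keep the whole VM on one node and avoid remote-memory access."""
    return vcpu_count <= cores_per_socket

print(fits_in_numa_node(8, 10))   # True  -- fits on one node
print(fits_in_numa_node(12, 10))  # False -- spans NUMA nodes
```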
14. Memory - Reclamation
• Transparent page sharing
– most efficient
• Memory Ballooning
– always install latest version of VMware tools
• Memory Compression
– may sound strange, but this is much faster than swapping
• Virtual Machines Swap
– Hypervisor swap, not to be confused with OS swap file/partition
15. Memory - Reclamation - Transparent Page Sharing (TPS)
Background process for removing duplicate memory pages
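As an illustration only, TPS-style deduplication can be modeled as content-hashing of memory pages; the real implementation also compares candidate pages byte for byte before sharing (to guard against hash collisions) and backs shared pages copy-on-write:

```python
import hashlib

def share_pages(pages):
    """Toy model of transparent page sharing: identical pages are stored
    once, keyed by a hash of their contents. Returns the backing store and,
    per guest page, the key of the machine page that backs it."""
    store = {}    # hash -> one physical copy of the page
    mapping = []  # guest page index -> backing key
    for page in pages:
        key = hashlib.sha256(page).hexdigest()
        if key not in store:
            store[key] = page
        mapping.append(key)
    return store, mapping

# Zero pages are the classic win: three of the four guest pages collapse
# into a single backing page.
zero = bytes(4096)
pages = [zero, zero, b"A" * 4096, zero]
store, mapping = share_pages(pages)
print(len(store))  # 2 -- four guest pages backed by two machine pages
```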
16. Memory – Reclamation - Ballooning
“Pushes” memory pressure from ESX host into VM
17. Memory - Reclamation - Compression
Essentially “zips” memory instead of swapping it so that it uses less space in RAM
18. Memory - Reclamation - Swapping
Writes VM memory from physical RAM out to disk
19. Memory - Monitoring
• Balloon driver size (MCTLSZ)
– The total amount of guest physical memory reclaimed by the balloon driver
– Investigation Threshold: 1
• Swapping (SWCUR)
– The current amount of guest physical memory that is swapped out to the ESX kernel VM swap file
– Investigation Threshold: 1
• Swap Reads/sec (SWR/s)
– The rate at which machine memory is swapped in from disk
– Investigation Threshold: 1
• Swap Writes/sec (SWW/s)
– The rate at which machine memory is swapped out to disk
– Investigation Threshold: 1
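All four counters above share the same rule: any sustained nonzero value deserves a look. A small sketch of that evaluation (the threshold table mirrors the slide; `flag_counters` and the sample values are hypothetical):

```python
# Investigation thresholds from the slide: any value at or above 1 for
# these counters means reclamation is active and worth investigating.
THRESHOLDS = {"MCTLSZ": 1, "SWCUR": 1, "SWR/s": 1, "SWW/s": 1}

def flag_counters(sample):
    """Return, sorted, the names of counters at or above their threshold."""
    return sorted(name for name, limit in THRESHOLDS.items()
                  if sample.get(name, 0) >= limit)

print(flag_counters({"MCTLSZ": 0, "SWCUR": 512, "SWR/s": 0, "SWW/s": 3}))
# ['SWCUR', 'SWW/s'] -- this VM has memory swapped out and is still swapping
```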
20. Memory – Best Practices
• Do not overcommit memory
• Configure swap in your Guest Operating System
– Size it to be at least equal to the configured vRAM for the VM
– Put the swap partition or swap file (for Windows) on a separate virtual disk
• Install VMware tools
– This enables the ballooning driver and enables the VMkernel to use the best memory reclamation
technique
• Enable Intel EPT / AMD RVI in the ESX host BIOS
• Use large memory pages in guest OS
– Minimizes the TLB misses
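The guest swap sizing rule above can be sketched as a simple check (hypothetical helper names):

```python
def guest_swap_ok(vram_mb, guest_swap_mb):
    """Best-practice check: size guest OS swap at least equal to the VM's
    configured vRAM, so that when the balloon driver inflates, the guest
    always has somewhere to page out its own cold memory."""
    return guest_swap_mb >= vram_mb

print(guest_swap_ok(4096, 4096))  # True
print(guest_swap_ok(4096, 2048))  # False -- undersized guest swap
```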
23. Storage – Monitoring
• Kernel Latency Average (KAVG)
– This counter tracks the latency of IO passing through the VMkernel
– Investigation Threshold: 1ms
• Device Latency Average (DAVG)
– This is the latency seen at the device driver level. It includes the round-trip time between the HBA
and the storage
– Investigation Threshold: 15-20ms, lower is better, some spikes are okay
• Abort (ABRT/s)
– The number of commands aborted per second
– Investigation Threshold: 1
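In esxtop, the latency the guest actually observes (GAVG) is approximately KAVG plus DAVG, so the two counters tell you whether time is being lost inside the VMkernel or on the device path. A sketch using the thresholds from the slide (helper name and messages are hypothetical):

```python
def storage_latency_report(kavg_ms, davg_ms):
    """Decompose guest-observed storage latency.

    GAVG (what the VM sees) is roughly KAVG (time spent in the VMkernel,
    mostly queuing) plus DAVG (round trip from HBA to the array and back).
    Thresholds follow the slide: ~1 ms for KAVG, ~15-20 ms for DAVG.
    """
    gavg = kavg_ms + davg_ms
    issues = []
    if kavg_ms > 1:
        issues.append("KAVG high: IO queuing inside the VMkernel")
    if davg_ms > 20:
        issues.append("DAVG high: slow device/array path")
    return gavg, issues

gavg, issues = storage_latency_report(0.2, 35.0)
print(gavg)    # 35.2
print(issues)  # ['DAVG high: slow device/array path']
```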
24. Storage – Best Practices
• Separate VM disks onto different physical disks if needed
• Do not oversize VM disks
– A VM disk can be expanded, but shrinking one is difficult
• Preprovision VM disks
– Don’t use thin-provisioned disks for mission-critical applications
• Install VMware Tools
– Installs optimized, specific OS drivers for the SCSI controllers
• Align guest OS disks
– Most modern OSes do this automatically
27. Network - Monitoring
• Transmit Dropped Packets (%DRPTX)
– The percentage of transmit packets dropped
– Investigation Threshold: 1
• Receive Dropped Packets (%DRPRX)
– The percentage of received packets dropped
– Investigation Threshold: 1
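The drop counters above are percentages, so the check is simply whether the drop rate crosses the threshold. A sketch with a hypothetical helper:

```python
def drop_pct(dropped, total):
    """Percentage of dropped packets over an interval; any sustained value
    at or above the threshold of 1 is worth investigating."""
    return 100.0 * dropped / total if total else 0.0

print(drop_pct(5, 10_000))       # 0.05 -- well under the threshold
print(drop_pct(5, 10_000) >= 1)  # False
```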
28. Network - Best Practices
• Load balance on vSwitch level, not inside VM
– Allow the Hypervisor to do the network teaming
• Install VMware Tools
– Installs optimized, specific OS drivers for the NIC adapters
• Use VMXNET3 vNIC adapters when possible
– Supported by most modern OSes
Shares configure the relative priority of this VM with regard to the other VMs in the same resource pool. Remember: shares only come into play when there is contention!
Limit: additionally caps the VM’s CPU time regardless of its shares and the current condition of the host. In other words, a VM may not run even when the host is idle and there are plenty of resources, simply because of the limit set on it.
Reservation: for CPU, the reservation is a guarantee of clock cycles, defined in MHz. Giving a virtual machine a reservation means the VMkernel CPU scheduler will give it at least that amount of resources. If the virtual machine is not using its reserved resources, the CPU cycles are not wasted on the physical host; other machines can use them. A CPU reservation is how you make sure a VM will always get access to a physical CPU in an overcommitted environment.
Hierarchy: the shares of all worlds running in the current resource pool add up to the shares of that resource pool, and every world receives CPU time proportional to its shares. The same principle applies recursively to the parent of the current resource pool.
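The interplay of shares, reservations, and limits can be sketched as a toy allocator. This is only an illustration under simplified assumptions, not the real VMkernel scheduler (which also accounts for demand and rebalances iteratively); all names and figures are hypothetical:

```python
def allocate_mhz(capacity_mhz, vms):
    """Toy share-based CPU allocation under contention.

    Each VM is (name, shares, reservation_mhz, limit_mhz or None).
    Reservations are granted first; the remainder is split in proportion
    to shares, then capped by each VM's limit. Capacity freed by limits is
    simply left unused here; the real scheduler would redistribute it.
    """
    alloc = {name: res for name, _, res, _ in vms}
    remaining = capacity_mhz - sum(alloc.values())
    total_shares = sum(shares for _, shares, _, _ in vms)
    for name, shares, res, limit in vms:
        extra = remaining * shares / total_shares
        alloc[name] = res + extra
        if limit is not None:
            alloc[name] = min(alloc[name], limit)
    return alloc

# db is guaranteed 1000 MHz and has double shares; batch is limited to
# 500 MHz no matter how idle the host is.
vms = [("db", 2000, 1000, None), ("web", 1000, 0, None), ("batch", 1000, 0, 500)]
print(allocate_mhz(4000, vms))
# {'db': 2500.0, 'web': 750.0, 'batch': 500}
```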
Demand = how much memory the VM wants to actively use
Entitlement = how much physical RAM a VM can get
VM will achieve best performance when entitlement >= demand
The exact entitlement is based on each VM’s priority relative to the other VMs: its shares, its idle memory, and the other VMs running on the host.
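The demand/entitlement relation above is a simple inequality; a sketch with hypothetical names:

```python
def performance_ok(entitlement_mb, demand_mb):
    """A VM performs best when its entitlement (the physical RAM it may
    get) covers its demand (the memory it actively wants to use); when it
    does not, reclamation (ballooning, compression, swapping) kicks in."""
    return entitlement_mb >= demand_mb

print(performance_ok(4096, 3000))  # True
print(performance_ok(2048, 3000))  # False -- reclamation will kick in
```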
Second level of page mapping, handled by the VMM and VMkernel. Traditionally this was handled in software, but hardware-assisted virtualization (Intel EPT and AMD RVI) now handles the shadow page mapping at the hardware level.
The guest operating system sees the virtual memory and “thinks” it can use all of it. Left like this, we would not be able to achieve a very good memory consolidation ratio.
The swap file size is defined as (swap file = configured memory – memory reservation)
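The swap file sizing formula above is straightforward to express; a sketch with hypothetical names:

```python
def vm_swap_file_mb(configured_mb, reservation_mb):
    """Hypervisor swap file size = configured vRAM minus memory reservation.
    Reserved memory is guaranteed to stay in physical RAM, so it never
    needs swap backing."""
    return configured_mb - reservation_mb

print(vm_swap_file_mb(8192, 2048))  # 6144
print(vm_swap_file_mb(8192, 8192))  # 0 -- a fully reserved VM needs no swap file
```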
Not configurable – always running. Very effective, especially for zero pages. Completely transparent to the guest operating system. Gives very good results in VMware View deployments.
Uses the VMware Tools balloon driver (vmmemctl). Always install VMware Tools if possible.
A background process looks for inactive memory pages. A page is compressed only if a 1:2 (50%) compression ratio can be achieved – the idea is to fit at least two compressed pages into one physical page. These pages (containing compressed memory) are, of course, never swapped.
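The 1:2 ratio rule above can be sketched as a simple gate: compress the page, keep the result only if it fits in half a page. This is an illustration only, with `zlib` standing in for the real compression algorithm:

```python
import os
import zlib

PAGE_SIZE = 4096

def try_compress(page):
    """Compress a page only if the result fits in half a page (the 1:2
    ratio the VMkernel requires), so two compressed pages can share one
    machine page. Returns None when the page isn't compressible enough,
    making it a candidate for swapping instead."""
    compressed = zlib.compress(page)
    if len(compressed) <= PAGE_SIZE // 2:
        return compressed
    return None

print(try_compress(bytes(PAGE_SIZE)) is not None)   # True -- zero page compresses well
print(try_compress(os.urandom(PAGE_SIZE)) is None)  # True -- random data doesn't
```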
Nothing much to say here – this is the traditional virtual memory handling found in almost every operating system out there.
More interesting is that the VMkernel memory manager supports swapping to multiple files. Every VM has at least two swap files: vRAM and VMX. The first is sized to hold the entire vRAM configured for the VM (nothing more, nothing less); this also supports suspending the machine. The second is for the VMX process that manages the VM; part of the VMX process is never swapped. The locations of both files can be configured.
In addition, the VMkernel supports one more swap file, large and common to all user processes running on the host (location also configurable). The idea is that many user worlds are running (management agents, daemons, etc.) but are not always needed in memory. By swapping them out, the VMkernel can achieve an even higher VM consolidation ratio.
With this I would like to finish the presentation and leave the rest of the time for questions and answers.