VMware Performance
  Troubleshooting

 Presented by Chris Kranz
Topics Covered
•   Introduction
•   Root Cause Analysis
•   Performance Characteristics
•   CPU
•   Networking
•   Memory
•   Disk
•   Virtual Machine optimisation
•   ESXTop
•   vm-support
•   Service Console
•   Resource Groups
•   Design Guidelines
•   Capacity Planner limitations and cautions
•   Conclusion
•   Reference Articles
Introduction
Multiple layers of virtualisation are used to
increase service levels, availability and
manageability

However, multiple layers of virtualisation often
mask performance and configuration issues
making it more of a challenge to troubleshoot
and correct

The worst outcome is that performance issues
after a virtualisation project create the
perception that VMware reduces performance,
which can damage future confidence in VMware
Performance Basics


• Virtual Machine Resources
  – CPU
  – Memory
  – Disk
  – Networking
Resource Maximums

                           Host           Guest
  Logical Processors       64              N/A
  Virtual CPUs             N/A              8
  Virtual CPUs per Core    20              N/A
  Memory                   1TB            255GB

http://www.vmware.com/pdf/vsphere4/r40/vsp_40_config_max.pdf
Typical Host

                                       vSphere 1U Host
  CPUs                                  2 x Quad Core
  Memory                                 32-64GB RAM


      Typically 3 VMs per core = 24 VMs per host
      Each with 2GB of RAM = 48GB of RAM
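The sizing arithmetic on this slide can be sketched as a small calculation. The function name and parameters are illustrative only; the figures are the slide's own rule of thumb, not a VMware formula.

```python
def host_sizing(sockets, cores_per_socket, vms_per_core, ram_per_vm_gb):
    """Return (VM count, total guest RAM in GB) for a rule-of-thumb host."""
    cores = sockets * cores_per_socket
    vms = cores * vms_per_core
    return vms, vms * ram_per_vm_gb

# 2 x quad core, 3 VMs per core, 2GB per VM (values from the slide)
vms, ram_gb = host_sizing(2, 4, 3, 2)
print(vms, "VMs needing", ram_gb, "GB of RAM")
```

The result should sit inside the 32-64GB host memory range, leaving headroom for overhead memory and the Service Console.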
Root Cause Analysis




http://www.vmware.com/resources/techresources/10066
Root Cause ...
Monitoring Performance

• Do not rely on guest tools alone, although they
  – Can show high CPU & memory utilisation
  – Can measure latency & throughput of disk &
    network interfaces
• Use the virtualisation layer to diagnose the cause:
  – The guest is unaware of the virtualisation workload
  – Guest OSes account for time differently
  – The guest has no visibility of available resources
Performance Analysis Tools

• esxtop (service console only)
• resxtop (remote command line utilities)
• Performance graphs in vCenter
esxtop

• esxtop can be run:
  – Interactively
  – Batch (e.g. esxtop -a -b > analysis.csv)
  – Load batch output into Windows perfmon or MS Excel


• Two keys to remember
  – H : help
  – F : fields to display
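Batch output can also be post-processed with a short script instead of perfmon or Excel. The sketch below assumes the file is perfmon-style CSV with one column per counter and counter names such as "% Ready" in the header; the host name, VM name and sample values are made up for illustration, so check the headers in your own capture first.

```python
import csv
import io

def counter_columns(csv_text, needle="% Ready"):
    """Return {column name: [samples]} for columns whose header contains needle."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if needle in name]
    series = {header[i]: [] for i in wanted}
    for row in reader:
        for i in wanted:
            series[header[i]].append(float(row[i]))
    return series

# Hypothetical two-sample capture; real files have many more columns.
sample = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Group Cpu(100:vm-a)\% Used","\\esx01\Group Cpu(100:vm-a)\% Ready"',
    '"02/24/2010 10:00:00","35.0","12.5"',
    '"02/24/2010 10:00:05","40.0","7.5"',
])

for name, values in counter_columns(sample).items():
    print(name, "peak:", max(values))
```

For a real capture, read analysis.csv into a string (or adapt the function to take a file object) and change the needle to the counter of interest.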
esxtop basics


                                Host Resources




   Name of Resource
   Pool, Virtual      Number of Worlds
   Machine or World
Performance Characteristics




    CPU            Memory           Networking       Disk
Slow Processing   Slow Processing    Packet Loss    Log Stalls
 High CPU Wait     Disk Swapping    Slow Network   Disk Queue



                  Slow Application Performance
                    Reduced User Experience
                     Data Loss and Corruption
CPU
         ESX Scheduler

                                        Basic World States
                                        Ready / Run / Wait

                                           CPU States
          Service    Virtual           Ready / Usage / Wait
          Console    Machine




      Limits / Shares / Reservations
CPU                        High %RDY + High %USED can imply over-commitment
esxtop
     •PCPU(%): CPU utilization
     •%USED: Utilization
     •%RDY: Ready Time
     •%RUN: Run Time
     •%WAIT: Wait and idling time
CPU
VI-Client
   Ready Time approaching Used Time:
   Possible CPU over-commitment


                                   Used Time




                                   Ready Time
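The VI Client charts ready time as milliseconds accumulated per sample, while esxtop shows it as a percentage (%RDY). A rough conversion between the two, assuming the 20 second real-time sample interval (other chart intervals need a different value), might look like this:

```python
def ready_percent(ready_ms, interval_s=20):
    """Convert a ready-time summation (ms per sample) to a percentage.

    interval_s is the chart sample interval; 20 seconds is assumed here
    for real-time charts.
    """
    return ready_ms * 100.0 / (interval_s * 1000.0)

# 2000 ms of ready time in one 20 s sample
print(ready_percent(2000))  # 10.0
```

The function name is illustrative; the point is only that the same ready time reads very differently depending on which tool reports it.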
CPU
Further Investigation


                        %MLMTD shows this VM has been limited
CPU
Further Investigation




                        High ready time caused by CPU resource limit
VMware Memory Management
• Transparent Page Sharing
• VMware Tools Balloon Driver to force the VM to swap to disk
• Virtual Machine Page File
Memory
Ballooning vs. Swapping
 The balloon driver causes the
 guest OS to swap pages that it
 chooses to disk

 ESX swapping will swap any
 pages to disk
Memory

• Ballooning can be disabled (0 value) or
  controlled on a per Virtual Machine basis
  using:
  sched.mem.maxmemctl
• Default is set to 65%, can be controlled at host
  level.
• Is only an issue in resource contention
  scenarios (or for latency-sensitive VMs, e.g. Citrix)
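To get a feel for what the 65% default means for a given VM, the ceiling can be worked out from its configured memory. This is a sketch only: it assumes the limit is expressed in MB (as sched.mem.maxmemctl is generally documented) and the helper name is made up.

```python
def default_balloon_limit_mb(configured_mb, max_fraction=0.65):
    """Largest amount the balloon may reclaim, in MB, at the default 65%."""
    return int(configured_mb * max_fraction)

for size_mb in (2048, 4096):
    print(size_mb, "MB VM -> balloon ceiling of", default_balloon_limit_mb(size_mb), "MB")
```

Setting a lower per-VM value reduces how much the guest can be squeezed, at the cost of pushing the host towards ESX swapping sooner under contention.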
Memory - Host

VI Client shows memory usage of the host. This is calculated as “consumed + overhead
memory + Service Console”.

Performance charts are a very good way of showing the Virtual Machine memory
breakdown.

    • Consumed Memory
    • Ballooned Memory
    • Shared Memory
    • Swapped Memory
Memory - Guest

  Host Memory = Consumed + Overhead Memory
   Guest Memory = Active Memory for Guest OS
Memory – Guest Overhead
Memory          Virtual Machine Memory Metrics – VI Client

Metric                 Description
Memory Active (KB)     Physical pages touched recently by a VM
Memory Usage (%)       Active memory / configured memory
Memory Consumed (KB)   Machine memory mapped to a virtual machine, including its portion of
                       shared pages. Doesn’t include overhead memory
Memory Granted (KB)    Physical pages allocated to a virtual machine. May be less than
                       configured memory. Includes shared pages. Doesn’t include overhead
                       memory.
Memory Shared (KB)     Physical pages shared with other virtual machines
Memory Balloon (KB)    Physical memory ballooned from a virtual machine
Memory Swapped (KB)    Physical memory in swap file (approx. “swap out – swap in”). Swap out
                       and Swap in are cumulative
Overhead Memory (KB)   Machine pages used for virtualisation
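Several of these metrics are simple combinations of others. The helpers below restate the definitions from the table and the earlier "Memory - Guest" slide as arithmetic; the function names and sample numbers are illustrative only.

```python
def memory_usage_percent(active_kb, configured_kb):
    """Memory Usage (%) = active memory / configured memory."""
    return 100.0 * active_kb / configured_kb

def swapped_kb(swap_out_kb, swap_in_kb):
    """Memory Swapped is approximately cumulative swap out minus swap in."""
    return swap_out_kb - swap_in_kb

def host_memory_kb(consumed_kb, overhead_kb):
    """Host memory used by a VM = consumed + overhead memory."""
    return consumed_kb + overhead_kb

# A 2GB VM with 512MB active
print(memory_usage_percent(524288, 2097152))  # 25.0
print(swapped_kb(1000, 400))                  # 600
print(host_memory_kb(1500000, 100000))        # 1600000
```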
Memory                              Host Memory Metrics – VI Client

Metric                  Description
Memory Active (KB)      Physical pages touched recently by the host
Memory Usage (%)        Active memory / configured memory
Memory Consumed (KB)    Total host physical memory – free memory on host. Includes Overhead
                        and Service Console memory
Memory Granted (KB)     Sum of physical pages allocated to all virtual machines. Doesn’t include
                        overhead memory.
Memory Shared (KB)      Physical pages shared by virtual machines on host
Shared Common (KB)      Total machine pages used by shared pages
Memory Balloon (KB)     Machine pages ballooned from virtual machines
Memory Swap Used (KB)   Physical memory in swap file (approx. “swap out – swap in”). Swap out
                        and Swap in are cumulative
Overhead Memory (KB)    Machine pages used for virtualisation
Memory
esxtop
     •PMEM: Total physical memory breakdown
     •VMKMEM: Memory managed by vmkernel
     •COSMEM: Service Console memory breakdown
     •PSHARE: Page sharing statistics
     •SWAP: Swap statistics
     •MEMCTL: Balloon driver data
Memory      esxtop / VI Client metrics : Virtual Machines




VI Client         esxtop
Active Memory     TCHD
Memory Usage      %ACTV
Consumed Memory   N/A
Memory Granted    N/A (SZTGT and CMTTGT represent memory scheduler targets)
Memory Shared     SHRD (+SHRDSVD per VM). Must enable COW stats in ESXTOP
Memory Balloon    MCTLSZ
Memory Swapped    SWCUR (SWR/s & SWW/s are rates)
Overhead Memory   OVHD & OVHDMAX
Memory          esxtop / VI Client metrics : Host Usage



VI Client              esxtop
Memory Active          N/A (try /proc/vmware/sched/mem-verbose)
Memory Usage           N/A (try /proc/vmware/sched/mem-verbose)
Memory Consumed        PMEM total – PMEM free
Memory Granted         N/A (SZTGT and CMTTGT represent memory scheduler targets)
Memory Shared          PSHARE (shared)
Memory Shared Common   PSHARE (common)
Memory Balloon         MEMCTL
Memory Swap Used       SWAP (r/s and w/s are rates)
Overhead Memory        OVHD & OVHDMAX
Memory
         VI Client memory usage graph
Memory
         Troubleshooting Memory usage issues
Networking
                       •Switch Assisted Teaming (IP Hash)
                       •VLAN Trunking
                       •Flow Control (full)
                       •Speed & Duplex (1000Mb / Full)
                       •Port Fast
                       •BPDU Disabled
                       •STP Disabled
                       •Link State Tracking
                       •Jumbo Frames
Network configuration is more likely to blame than resource contention
Networking
esxtop
                               Transmit and Receive in Mb/s

              Transmit and Receive in Packets
Networking
esxtop


                  Dropped Packets Transmitted
                                   Dropped Packets Received
Disk
 Varying Factors
    • File system performance
    • Disk subsystem configuration (SAN, NAS, iSCSI, local disk)
    • Disk caching
    • Disk formats (thick, sparse, thin)

ESX Storage Stack
   •Different latencies for different disks
   •Queuing within the kernel

                                         K: Kernel
                                         D: Device
                                         G: Guest
Disk                                     VI Client statistics

Quite Coarse Statistics
   • Disk read / write rate (KB/s)
   • Disk usage: sum of read BW and write BW (KB/s)
   • Disk read / write requests (per 20s interval)
   • Bus resets / Command aborts (per 20s interval)
   •Per LUN or aggregated stats
Disk                                    esxtop statistics
Aggregated stats similar to VI Client
    • Disk read / write per sec (READS/s, WRITES/s)
    • MB read / write per sec (MBREAD/s, MBWRTN/s)
Latency Statistics
    • Kernel Average / command (KAVG/cmd)
    • Device Average / command (DAVG/cmd)
    • Guest Average / command (GAVG/cmd)
Queuing Information
    • Adapter Queue Length (AQLEN)
    • LUN Queue Length (LQLEN)
    • VMKernel (QUED)
    • Active Queue (ACTV)
    • %Used (%USD = ACTV/LQLEN)
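The queue and latency counters relate to each other in a simple way. %USD is defined on the slide; the sum GAVG = KAVG + DAVG is not stated here but is the commonly documented relationship between the three latency counters, so treat it as an assumption. Function names and sample values are made up.

```python
def percent_used(actv, lqlen):
    """%USD = ACTV / LQLEN, expressed as a percentage."""
    return 100.0 * actv / lqlen

def guest_latency_ms(kavg_ms, davg_ms):
    """Latency seen by the guest: time in the kernel plus time at the device."""
    return kavg_ms + davg_ms

print(percent_used(16, 32))         # 50.0
print(guest_latency_ms(0.5, 12.0))  # 12.5
```

A high DAVG with a low KAVG points at the array or fabric; a high KAVG suggests queuing inside the host.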
Disk
SAN Rough Estimates
 Purely looking at a single ESX host, roughly:
 Throughput (in MBps) = (Outstanding IOs * Block size in KB) / latency in msec

 FC, rough maximums:
 Effective Link Bandwidth = ~80/90% of Real Bandwidth
      Effective (2Gbps) = 200 – 230 MBps
      Effective (4Gbps) = 410 – 460 MBps
      Effective (8Gbps) = 820 – 920 MBps

 iSCSI / NFS / FCoE, rough maximums:
 Effective Link Bandwidth = ~70/80% of Real Bandwidth
      Effective (1GigE) = 90 – 100 MBps
      Effective (10GigE) = 900 – 1000 MBps
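The rough estimates above can be turned into a small calculator. It is a back-of-envelope sketch only: the KB per msec to MBps conversion is treated as 1:1, and the example inputs (32 outstanding IOs, 32KB blocks, 10 msec, 125 MBps raw for 1GigE) are illustrative values rather than measurements.

```python
def throughput_mbps(outstanding_ios, block_kb, latency_ms):
    """Throughput (MBps) = (outstanding IOs * block size in KB) / latency in msec."""
    return outstanding_ios * block_kb / latency_ms

def effective_bandwidth_mbps(raw_mbps, efficiency):
    """Effective link bandwidth as a fraction of the raw line rate."""
    return raw_mbps * efficiency

print(throughput_mbps(32, 32, 10))
print(effective_bandwidth_mbps(125, 0.7), effective_bandwidth_mbps(125, 0.8))
```

If the estimated throughput exceeds the effective link bandwidth, the link rather than the disks is the likely ceiling.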
Disk
Desired Latency Calculations
Desired Latency in msec <= (Outstanding IOs * Block size in KB) / Throughput per host

Example:
Number of Hosts: 16
Effective Link Bandwidth: 90 MBps
Throughput per host: 90 / 16 = 5.6 MBps
Desired Latency: (32 * 32) / (5.6) = 182.86 msec


Workload                         Cached Sequential Read        Cached Sequential Write
Desired Latency (msec)                    182.86                        182.86
Observed Latency (msec)                    ~350                          ~180
Throughput Drop?                            Yes                           No
Throughput (MBps)                           ~45                          ~90
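The worked example and the table's "Throughput Drop?" row can be reproduced with a few lines. The function names are illustrative; note that the slide rounds 90 / 16 to 5.6 before dividing, which is why both results are shown.

```python
def desired_latency_ms(outstanding_ios, block_kb, throughput_per_host_mbps):
    """Desired latency (msec) <= (outstanding IOs * block size in KB) / throughput per host."""
    return outstanding_ios * block_kb / throughput_per_host_mbps

def throughput_drop(observed_ms, desired_ms):
    """A drop is expected when observed latency exceeds the desired latency."""
    return observed_ms > desired_ms

per_host = 90 / 16                                     # 5.625 MBps, 5.6 on the slide
print(round(desired_latency_ms(32, 32, 5.6), 2))       # 182.86, as on the slide
print(round(desired_latency_ms(32, 32, per_host), 2))  # 182.04 without rounding

desired = desired_latency_ms(32, 32, 5.6)
print(throughput_drop(350, desired))  # cached sequential read:  True
print(throughput_drop(180, desired))  # cached sequential write: False
```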
Disk
VI Client
  SAN Cache enabled: high throughput
  SAN Cache disabled: poor throughput
Disk
esxtop




             Latency is quite high



            After enabling cache,
            Latency is reduced
Virtual Machine Optimisation
 Deploy all machines from an optimised template!
• VMware tools MUST be installed
• The disks MUST be block aligned to the storage (even when using NFS and SAN)
• Where possible, always separate data disks from OS disks
• Windows performance settings should be optimised for application performance
• Guest operating system timeouts should be set as defined by the SAN vendor
• Pagefile should be separated where appropriate (this can impact VMware SRM however)
• Unused Windows services should be disabled (wireless config, print spooler, audio, etc.)
• Last access update time should be disabled (unless required)
• Logging of the VM should be disabled (only enabled for troubleshooting)
• Remove any unused virtual hardware (floppy drives, USB, etc.)
• Disable screen savers and power saving features, including logon screen saver
• Enable Remote Desktop, avoid using the VI Client for remote administration
• Install standard applications into template (bginfo, AntiVirus, any host agents, etc)
• Multiple vCPUs should be allocated sparingly
Virtual Machine Optimisation
Block alignment is vital to good disk performance!
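Whether a partition is aligned comes down to whether its starting offset is a whole multiple of the storage block size. The check below is a minimal sketch that assumes a 4KB storage block (use the size your storage vendor specifies); the 63-sector start is the common legacy MBR default on older guests.

```python
def is_aligned(start_offset_bytes, block_bytes=4096):
    """True when the partition start falls on a storage block boundary."""
    return start_offset_bytes % block_bytes == 0

print(is_aligned(63 * 512))     # legacy 63-sector start -> False
print(is_aligned(64 * 1024))    # 64KB offset            -> True
print(is_aligned(1024 * 1024))  # 1MB offset             -> True
```

A misaligned guest partition means a single guest I/O can straddle two storage blocks, so the array does extra work for every request.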
esxtop
Command Options when inside esxtop

Command   Action
space     Update the display
?         Show the help page
q         Quit
f/F       Add or remove columns from the display
o/O       Change the order of the displayed columns
s         Change the update interval
#         Change the number of instances to display
W         Write configuration to file
e         Expand / roll up CPU stats
V         View only VM instances
L         Change the length of the NAME field
m         Display memory statistics
n         Display network statistics
i         Display interrupt statistics
d         Display disk adapter statistics
u         Display disk device statistics
v         Display disk VM statistics
esxtop
Command Line Options from the console

Option   Action
-b       Batch mode
-l       Locks the objects available in the first snapshot
-s       Enables secure mode
-a       Show all statistics
-c       Sets the configuration file
-R       Enables replay mode (used with "vm-support -S")
-d       Sets the update interval
-n       Runs esxtop for n iterations
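Combining -b, -a, -d and -n gives a fixed-length capture. The sketch below only builds the command line and estimates how long it will run (roughly interval times iterations); it does not execute anything, and the helper names are made up for illustration.

```python
import shlex

def esxtop_batch_command(delay_s, iterations, all_stats=True):
    """Build an esxtop batch-mode argument list from the options above."""
    args = ["esxtop", "-b"]
    if all_stats:
        args.append("-a")
    args += ["-d", str(delay_s), "-n", str(iterations)]
    return args

def capture_seconds(delay_s, iterations):
    """Approximate capture length: update interval times number of iterations."""
    return delay_s * iterations

print(shlex.join(esxtop_batch_command(5, 120)), "> analysis.csv")
print("runs for about", capture_seconds(5, 120), "seconds")
```

Capturing all statistics with -a produces very wide CSV files, so a shorter run or a saved configuration file (-c) may be easier to work with.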
esxtop




Expand the default window size for your session to get all statistics
vm-support
Creates a packaged zip file containing the following sections:
   • boot
         • contains the grub configuration
   • etc
         • contains the Console OS configuration files (cron, tcpwrappers, syslog, etc)
   • proc
         • contains much of the hardware configuration modules and variables
   • tmp
         • contains a lot of the ESX specific configuration output
   • var
         • contains log files and any core dumps
   • vmfs
         • contains the structure of the VMFS datastores
   • esx3-installation (where appropriate)
         • contains a copy of the previous esx3 configuration variables
vm-support
Using vm-support to extract performance information:

vm-support -S -d <duration> -i <interval>
<duration> and <interval> are in seconds

The output from this can then be replayed in esxtop for review after it has been
extracted.

esxtop -R <path_to_vm-support_output>
Service Console Performance
•Multiple Service Console networks – for network resiliency
•Increased Service Console memory – up to 800MB
•Use host agents supplied by your vendors
•Apply storage vendor recommended tweaks such as HBA queue depth
and I/O timeouts
•Minimal use of the VI Client console – use RDP or SSH instead
•Properly sized vCenter server – 64bit OS where possible
Resource Groups
                  Dynamically reallocate resource shares




                  Additional VM: shares allow you to over-commit
                  resources and have a graceful re-allocation



                  Remove a VM and exploit extra resources
                  across all remaining VMs
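The re-allocation described here can be illustrated with a simplified proportional-share model. This is a sketch under stated assumptions: it ignores reservations and limits, assumes every VM is demanding its full entitlement, and the VM names, share values and 8000MHz pool are made up.

```python
def allocate_by_shares(capacity_mhz, shares):
    """Split capacity between VMs in proportion to their shares under contention."""
    total = sum(shares.values())
    return {vm: capacity_mhz * s / total for vm, s in shares.items()}

pool = {"vm-a": 1000, "vm-b": 1000, "vm-c": 2000}
print(allocate_by_shares(8000, pool))

# Adding a VM shrinks everyone's slice gracefully rather than starving one VM.
pool["vm-d"] = 4000
print(allocate_by_shares(8000, pool))

# Removing it again hands the capacity back to the remaining VMs.
del pool["vm-d"]
print(allocate_by_shares(8000, pool))
```

Shares only matter when resources are contended; an idle VM's unused entitlement is available to the others.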
Design Guidelines
• Full Resilience / Multiple paths
• Standard configuration across all aspects (ESX, Storage, Networking, etc.)
• Standard naming conventions
• Learn from others' mistakes
• Follow vendors' best-practice guidelines
• Rule out the basics before requesting support
Capacity Planner & P2V Cautions and Limitations

• Peak CPU usage can sometimes be misleading
• Back-end storage system performance
• P2V machines will require block-aligning to the storage
• P2V machines will still require guest OS optimisation
Conclusion
• Performance issues can often be traced with simple root cause
analysis using basic tools (VI Client / esxtop)
• Performance tools help diagnose issues and help rule out
non-issues
• Performance tools are useful in different contexts, not always
either/or
    • Real-time data and troubleshooting: esxtop
    • Historical data: VI Client
    • Coarse resource / cluster usage: VI Client
    • Detailed resource usage: esxtop
• Combine information from various tools to get a complete picture
• Always benchmark your systems first so you know what optimal
performance you can expect
Reference Articles
•   http://www.vmware.com/pdf/esx3_memory.pdf
•   http://www.vmworld.com/docs/DOC-2370
•   http://blogs.vmware.com/performance/
•   http://communities.vmware.com/docs/DOC-5420
•   http://kb.vmware.com/kb/1008205
•   http://communities.vmware.com/community/vmtn/general/performance
•   http://www.vmware.com/products/vmmark/
•   http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf
•   http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf
•   http://www.vmware.com/pdf/vsphere4/r40/vsp_40_resource_mgmt.pdf
•   http://www.vmware.com/pdf/GuestOS_guide.pdf
•   http://www.vmware.com/resources/techresources/10066
•   http://www.vmware.com/resources/techresources/10059
•   http://www.vmware.com/resources/techresources/10062

More Related Content

What's hot

Tuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentTuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentJignesh Shah
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuiteEDB
 
Demand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsDemand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsHwanju Kim
 
6. Live VM migration
6. Live VM migration6. Live VM migration
6. Live VM migrationHwanju Kim
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStackEDB
 
Advanced performance troubleshooting using esxtop
Advanced performance troubleshooting using esxtopAdvanced performance troubleshooting using esxtop
Advanced performance troubleshooting using esxtopAlan Renouf
 
Tuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy WorkloadTuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy WorkloadMarius Adrian Popa
 
CPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureCPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureHwanju Kim
 
Virtual Infrastructure Disaster Recovery
Virtual Infrastructure Disaster RecoveryVirtual Infrastructure Disaster Recovery
Virtual Infrastructure Disaster RecoveryDavoud Teimouri
 
My experience with embedding PostgreSQL
 My experience with embedding PostgreSQL My experience with embedding PostgreSQL
My experience with embedding PostgreSQLJignesh Shah
 
12 christian ferber xen_server_advanced
12 christian ferber xen_server_advanced12 christian ferber xen_server_advanced
12 christian ferber xen_server_advancedDigicomp Academy AG
 
Planning & Best Practice for Microsoft Virtualization
Planning & Best Practice for Microsoft VirtualizationPlanning & Best Practice for Microsoft Virtualization
Planning & Best Practice for Microsoft VirtualizationLai Yoong Seng
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux ContainersJignesh Shah
 
Scott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceScott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceNordic Infrastructure Conference
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerYongseok Oh
 
Distributed Caching Essential Lessons (Ts 1402)
Distributed Caching   Essential Lessons (Ts 1402)Distributed Caching   Essential Lessons (Ts 1402)
Distributed Caching Essential Lessons (Ts 1402)Yury Kaliaha
 

What's hot (20)

Tuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris EnvironmentTuning DB2 in a Solaris Environment
Tuning DB2 in a Solaris Environment
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster Suite
 
XS Boston 2008 Memory Overcommit
XS Boston 2008 Memory OvercommitXS Boston 2008 Memory Overcommit
XS Boston 2008 Memory Overcommit
 
Demand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsDemand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMs
 
XS Boston 2008 Quantitative
XS Boston 2008 QuantitativeXS Boston 2008 Quantitative
XS Boston 2008 Quantitative
 
6. Live VM migration
6. Live VM migration6. Live VM migration
6. Live VM migration
 
Postgres on OpenStack
Postgres on OpenStackPostgres on OpenStack
Postgres on OpenStack
 
Advanced performance troubleshooting using esxtop
Advanced performance troubleshooting using esxtopAdvanced performance troubleshooting using esxtop
Advanced performance troubleshooting using esxtop
 
Tuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy WorkloadTuning Linux Windows and Firebird for Heavy Workload
Tuning Linux Windows and Firebird for Heavy Workload
 
How swift is your Swift - SD.pptx
How swift is your Swift - SD.pptxHow swift is your Swift - SD.pptx
How swift is your Swift - SD.pptx
 
CPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureCPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop Infrastructure
 
Virtual Infrastructure Disaster Recovery
Virtual Infrastructure Disaster RecoveryVirtual Infrastructure Disaster Recovery
Virtual Infrastructure Disaster Recovery
 
Xen Memory Management
Xen Memory ManagementXen Memory Management
Xen Memory Management
 
My experience with embedding PostgreSQL
 My experience with embedding PostgreSQL My experience with embedding PostgreSQL
My experience with embedding PostgreSQL
 
12 christian ferber xen_server_advanced
12 christian ferber xen_server_advanced12 christian ferber xen_server_advanced
12 christian ferber xen_server_advanced
 
Planning & Best Practice for Microsoft Virtualization
Planning & Best Practice for Microsoft VirtualizationPlanning & Best Practice for Microsoft Virtualization
Planning & Best Practice for Microsoft Virtualization
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux Containers
 
Scott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceScott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilience
 
Revisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS SchedulerRevisiting CephFS MDS and mClock QoS Scheduler
Revisiting CephFS MDS and mClock QoS Scheduler
 
Distributed Caching Essential Lessons (Ts 1402)
Distributed Caching   Essential Lessons (Ts 1402)Distributed Caching   Essential Lessons (Ts 1402)
Distributed Caching Essential Lessons (Ts 1402)
 

Similar to Vmwareperformancetroubleshooting 100224104321-phpapp02

vSphere APIs for performance monitoring
vSphere APIs for performance monitoringvSphere APIs for performance monitoring
vSphere APIs for performance monitoringAlan Renouf
 
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev Conference
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machineheraflux
 
Presentation v mware performance overview
Presentation   v mware performance overviewPresentation   v mware performance overview
Presentation v mware performance overviewsolarisyourep
 
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...The Linux Foundation
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualizationSisimon Soman
 
VMWare Performance Tuning by Virtera (Jan 2009)
VMWare Performance Tuning by  Virtera (Jan 2009)VMWare Performance Tuning by  Virtera (Jan 2009)
VMWare Performance Tuning by Virtera (Jan 2009)vmug
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Amazon Web Services
 
VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server VMworld
 
VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash VMworld 2013: Extreme Performance Series: Storage in a Flash
VMworld 2013: Extreme Performance Series: Storage in a Flash VMworld
 
Spectrum Scale Memory Usage
Spectrum Scale Memory UsageSpectrum Scale Memory Usage
Spectrum Scale Memory UsageTomer Perry
 
Get Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI ServersGet Your GeekOn with Ron - Session One: Designing your VDI Servers
Get Your GeekOn with Ron - Session One: Designing your VDI ServersUnidesk Corporation
 
ChinaNetCloud - Zabbix Monitoring System Overview
ChinaNetCloud - Zabbix Monitoring System OverviewChinaNetCloud - Zabbix Monitoring System Overview
ChinaNetCloud - Zabbix Monitoring System OverviewChinaNetCloud
 
20160503 Amazed by AWS | Tips about Performance on AWS
20160503 Amazed by AWS | Tips about Performance on AWS20160503 Amazed by AWS | Tips about Performance on AWS
20160503 Amazed by AWS | Tips about Performance on AWSAmazon Web Services Korea
 
VMworld 2013: Performance and Capacity Management of DRS Clusters
VMworld 2013: Performance and Capacity Management of DRS Clusters VMworld 2013: Performance and Capacity Management of DRS Clusters
VMworld 2013: Performance and Capacity Management of DRS Clusters VMworld
 
Virtualization overheads
Virtualization overheadsVirtualization overheads
Virtualization overheadsSandeep Joshi
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructurexKinAnx
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructuresolarisyourep
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward
 

Similar to Vmwareperformancetroubleshooting 100224104321-phpapp02 (20)

VDI Design Guide
VDI Design GuideVDI Design Guide
VDI Design Guide
 
vSphere APIs for performance monitoring
vSphere APIs for performance monitoringvSphere APIs for performance monitoring
vSphere APIs for performance monitoring
 
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machine
 
Presentation v mware performance overview
Presentation   v mware performance overviewPresentation   v mware performance overview
Presentation v mware performance overview
 
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
XPDS13: Performance Evaluation of Live Migration based on Xen ARM PVH - Jaeyo...
 
VDI storage and storage virtualization
VDI storage and storage virtualizationVDI storage and storage virtualization
VDI storage and storage virtualization
 
VMWare Performance Tuning by Virtera (Jan 2009)
Vmwareperformancetroubleshooting 100224104321-phpapp02

  • 1. VMware Performance Troubleshooting Presented by Chris Kranz
  • 2. Topics Covered • Introduction • Root Cause Analysis • Performance Characteristics • CPU • Networking • Memory • Disk • Virtual Machine optimisation • ESXTop • vm-support • Service Console • Resource Groups • Design Guidelines • Capacity Planner limitations and cautions • Conclusion • Reference Articles
  • 3. Introduction Multiple layers of virtualisation are used to increase service levels, availability and manageability. However, multiple layers of virtualisation often mask performance and configuration issues, making them more of a challenge to troubleshoot and correct. The worst outcome is that performance issues after a virtualisation project lead to the perception that VMware reduces performance, which can affect future confidence in VMware.
  • 4. Performance Basics • Virtual Machine Resources – CPU – Memory – Disk – Networking
  • 5. Resource Maximums
                               Host           Guest
      Logical Processors       64             N/A
      Virtual CPUs             N/A            8
      Virtual CPUs per Core    20             N/A
      Memory                   1TB            256GB
    http://www.vmware.com/pdf/vsphere4/r40/vsp_40_config_max.pdf
  • 6. Typical Host
                               vSphere 1U Host
      CPUs                     2 x Quad Core
      Memory                   32-64GB RAM
    Typically 3 VMs per core, 24 VMs per host. Each has 2GB of RAM = 48GB of RAM
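The sizing arithmetic on this slide can be checked with a short sketch. The function and parameter names are illustrative and not part of the original deck; the inputs are the slide's own figures.

```python
def host_sizing(sockets, cores_per_socket, vms_per_core, ram_per_vm_gb):
    """Return (vm_count, ram_needed_gb) for a simple consolidation estimate."""
    cores = sockets * cores_per_socket
    vm_count = cores * vms_per_core
    return vm_count, vm_count * ram_per_vm_gb


# 2 x quad core, 3 VMs per core, 2GB per VM (figures from the slide)
vms, ram_gb = host_sizing(sockets=2, cores_per_socket=4,
                          vms_per_core=3, ram_per_vm_gb=2)
print(f"{vms} VMs need {ram_gb}GB of RAM")
```

The result sits inside the 32-64GB range quoted for the typical host, which is why RAM rather than CPU often sets the ceiling in this kind of estimate.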
  • 9. Monitoring Performance • Do not rely on guest tools, but they: – Can show high CPU & memory utilisation – Can measure latency & throughput of disk & network interfaces • Use the virtualisation layer to diagnose the cause: – Guest is unaware of virtualisation workload – Guest OSes account for time differently – Guest has no visibility of available resources
  • 10. Performance Analysis Tools • esxtop (service console only) • resxtop (remote command line utilities) • Performance graphs in vCenter
  • 11. esxtop • esxtop can be run: – Interactively – In batch mode (e.g. esxtop -a -b > analysis.csv) – Load the batch output into Windows perfmon or MS Excel • Two keys to remember – H : help – F : fields to display
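Batch output can also be summarised with a script instead of perfmon or Excel. The sketch below is a minimal example that assumes perfmon-style column headers of the form `\\host\Group Cpu(id:name)\% Ready`; the host name, VM names and values are made up, and the sample is built in memory so the block runs on its own.

```python
import csv
import io

# Hypothetical stand-in for analysis.csv. Real batch files are far wider,
# and the exact header naming here is an assumption.
HEADER = [
    "(PDH-CSV 4.0) (UTC)(0)",
    r"\\esx01\Group Cpu(101:web01)\% Ready",
    r"\\esx01\Group Cpu(102:db01)\% Ready",
]
ROWS = [
    ["02/24/2010 10:00:00", "2.5", "12.0"],
    ["02/24/2010 10:00:05", "3.5", "14.0"],
]


def build_sample():
    """Write the sample rows as quoted CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(HEADER)
    writer.writerows(ROWS)
    return buffer.getvalue()


def average_by_counter(csv_text, counter_suffix):
    """Average every column whose header ends with counter_suffix."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    averages = {}
    for index, name in enumerate(header):
        if not name.endswith(counter_suffix):
            continue
        values = [float(row[index]) for row in samples if row[index]]
        if values:
            averages[name] = sum(values) / len(values)
    return averages


ready = average_by_counter(build_sample(), "% Ready")
for column, value in ready.items():
    print(f"{column}: {value:.1f}")
```

For a real capture, replace `build_sample()` with the contents of the file written by the batch command.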
  • 12. esxtop basics (screenshot callouts) Host Resources; Name of Resource Pool, Virtual Machine or World; Number of Worlds
  • 13. Performance Characteristics
      CPU               Memory            Networking      Disk
      Slow Processing   Slow Processing   Packet Loss     Log Stalls
      High CPU Wait     Disk Swapping     Slow Network    Disk Queue
    Resulting symptoms: Slow Application Performance, Reduced User Experience, Data Loss and Corruption
  • 14. CPU ESX Scheduler • Basic world states: Ready / Run / Wait • CPU states: Ready / Usage / Wait • Scheduled worlds: Service Console, Virtual Machine • Controls: Limits / Shares / Reservations
  • 15. CPU esxtop: high %RDY + high %USED can imply overcommitment • PCPU(%): CPU utilisation • %USED: Utilisation • %RDY: Ready time • %RUN: Run time • %WAIT: Wait and idling time
  • 16. CPU VI Client: Used Time > Ready Time: possible CPU overcommitment (chart series: Used Time, Ready Time)
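The VI Client reports ready time as milliseconds accumulated per sample interval, while esxtop reports %RDY as a percentage, so a conversion helps when comparing the two. The sketch below assumes the 20-second real-time sample interval; the function name and the example value are illustrative.

```python
def ready_percent(ready_ms, interval_s=20):
    """Convert a ready-time summation (ms per sample) to a percentage.

    interval_s is the chart sample interval; 20 seconds is assumed here,
    which matches the real-time charts.
    """
    return ready_ms * 100.0 / (interval_s * 1000)


# 2000 ms of ready time in one 20 s sample
print(f"{ready_percent(2000):.1f}% ready")
```

Historical charts use longer roll-up intervals, so the same millisecond value would translate into a much smaller percentage there.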
  • 17. CPU Further Investigation %MLMTD shows this VM has been limited
  • 18. CPU Further Investigation High ready time caused by CPU resource limit
  • 19. VMware Memory Management • Transparent Page Sharing • VMware Tools Balloon Driver to force the VM to swap to disk • Virtual Machine Page File
  • 20. Memory Ballooning vs. Swapping • The balloon driver causes the guest OS to swap pages that it chooses to disk • ESX swapping will swap any pages to disk, without the guest's knowledge
  • 21. Memory • Ballooning can be disabled (0 value) or controlled on a per Virtual Machine basis using: sched.mem.maxmemctl • Default is set to 65%, can be controlled at host level • Only an issue in resource contention scenarios (or for VMs with low-latency requirements, e.g. Citrix)
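To see what the 65% default means for a given VM, the sketch below estimates the balloon ceiling. It rests on two assumptions that the slide does not state: that the percentage applies to the VM's configured memory, and that a per-VM `sched.mem.maxmemctl` value is expressed in MB and simply replaces the default when set. The function name is hypothetical.

```python
def balloon_cap_mb(configured_mb, max_percent=65, maxmemctl_mb=None):
    """Estimate how much memory ballooning may reclaim from one VM.

    Assumption: max_percent applies to configured memory, and a per-VM
    maxmemctl_mb (0 disables ballooning) overrides it when given.
    """
    if maxmemctl_mb is not None:
        return float(maxmemctl_mb)
    return configured_mb * max_percent / 100


print(balloon_cap_mb(4096))                   # default 65% ceiling
print(balloon_cap_mb(4096, maxmemctl_mb=0))   # ballooning disabled
```

Treat the output as a planning figure only; the actual reclaim target is decided by the host under contention.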
  • 22. Memory - Host VI Client shows memory usage of the host. This is calculated as “consumed + overhead memory + Service Console”. Performance charts are a very good way of showing the Virtual Machine memory breakdown. • Consumed Memory • Ballooned Memory • Shared Memory • Swapped Memory
  • 23. Memory - Guest Host Memory = Consumed + Overhead Memory Guest Memory = Active Memory for Guest OS
  • 24. Memory – Guest Overhead
  • 25. Memory Virtual Machine Memory Metrics – VI Client (Metric: Description)
      – Memory Active (KB): Physical pages touched recently by a VM
      – Memory Usage (%): Active memory / configured memory
      – Memory Consumed (KB): Machine memory mapped to a virtual machine, including its portion of shared pages. Doesn’t include overhead memory
      – Memory Granted (KB): Physical pages allocated to a virtual machine. May be less than configured memory. Includes shared pages. Doesn’t include overhead memory
      – Memory Shared (KB): Physical pages shared with other virtual machines
      – Memory Balloon (KB): Physical memory ballooned from a virtual machine
      – Memory Swapped (KB): Physical memory in swap file (approx. “swap out – swap in”). Swap out and Swap in are cumulative
      – Overhead Memory (KB): Machine pages used for virtualisation
  • 26. Memory Host Memory Metrics – VI Client (Metric: Description)
      – Memory Active (KB): Physical pages touched recently by the host
      – Memory Usage (%): Active memory / configured memory
      – Memory Consumed (KB): Total host physical memory – free memory on host. Includes Overhead and Service Console memory
      – Memory Granted (KB): Sum of physical pages allocated to all virtual machines. Doesn’t include overhead memory
      – Memory Shared (KB): Physical pages shared by virtual machines on host
      – Shared Common (KB): Total machine pages used by shared pages
      – Memory Balloon (KB): Machine pages ballooned from virtual machines
      – Memory Swap Used (KB): Physical memory in swap file (approx. “swap out – swap in”). Swap out and Swap in are cumulative
      – Overhead Memory (KB): Machine pages used for virtualisation
  • 27. Memory esxtop • PMEM: Total physical memory breakdown • VMKMEM: Memory managed by vmkernel • COSMEM: Service Console memory breakdown • PSHARE: Page sharing statistics • SWAP: Swap statistics • MEMCTL: Balloon driver data
  • 28. Memory esxtop / VI Client metrics: Virtual Machines (VI Client: esxtop)
      – Active Memory: TCHD
      – Memory Usage: %ACTV
      – Consumed Memory: N/A
      – Memory Granted: N/A (SZTGT and CMTTGT represent memory scheduler targets)
      – Memory Shared: SHRD (+SHRDSVD per VM). Must enable COW stats in esxtop
      – Memory Balloon: MCTLSZ
      – Memory Swapped: SWCUR (SWR/s & SWW/s are rates)
      – Overhead Memory: OVHD & OVHDMAX
  • 29. Memory esxtop / VI Client metrics: Host Usage (VI Client: esxtop)
      – Memory Active: N/A (try /proc/vmware/sched/mem-verbose)
      – Memory Usage: N/A (try /proc/vmware/sched/mem-verbose)
      – Memory Consumed: PMEM total - PMEM free
      – Memory Granted: N/A (SZTGT and CMTTGT represent memory scheduler targets)
      – Memory Shared: PSHARE (shared)
      – Memory Shared Common: PSHARE (common)
      – Memory Balloon: MEMCTL
      – Memory Swap Used: SWAP (r/s and w/s are rates)
      – Overhead Memory: OVHD & OVHDMAX
  • 30. Memory VI Client memory usage graph
  • 31. Memory Troubleshooting Memory usage issues
  • 32. Networking •Switch Assisted Teaming (IP Hash) •VLAN Trunking •Flow Control (full) •Speed & Duplex (1000Mb / Full) •Port Fast •BPDU Disabled •STP Disabled •Link State Tracking •Jumbo Frames Network configuration is more likely to blame than resource contention
  • 33. Networking esxtop Transmit and Receive in Mb/s Transmit and Receive in Packets
  • 34. Networking esxtop Dropped Packets Transmit Drop Packets Received
  • 35. Disk Varying Factors • File system performance • Disk subsystem configuration (SAN, NAS, iSCSI, local disk) • Disk caching • Disk formats (thick, sparse, thin) ESX Storage Stack •Different latencies for different disks •Queuing within the kernel K: Kernel D: Device G: Guest
  • 36. Disk VI Client statistics Quite Coarse Statistics • Disk read / write rate (KB/s) • Disk usage: sum of read BW and write BW (KB/s) • Disk read / write requests (per 20s interval) • Bus resets / Command aborts (per 20s interval) •Per LUN or aggregated stats
  • 37. Disk esxtop statistics Aggregated stats similar to VI Client • Disk read / write per sec (READS/s, WRITES/s) • MB read / write per sec (MBREAD/s, MBWRTN/s) Latency Statistics • Kernel Average / command (KAVG/cmd) • Device Average / command (DAVG/cmd) • Guest Average / command (GAVG/cmd) Queuing Information • Adapter Queue Length (AQLEN) • LUN Queue Length (LQLEN) • VMKernel (QUED) • Active Queue (ACTV) • %Used (%USD = ACTV/LQLEN)
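Two of the relationships on this slide are easy to compute by hand. The sketch below shows %USD as defined on the slide, plus the commonly described approximation that guest latency is the sum of kernel and device latency; that second relationship is not stated in the deck and is included as an assumption. The sample numbers are made up.

```python
def queue_used_percent(actv, lqlen):
    """%USD as defined on the slide: active commands over LUN queue length."""
    return 100.0 * actv / lqlen


def guest_latency_ms(kavg_ms, davg_ms):
    """Approximate GAVG/cmd as KAVG/cmd + DAVG/cmd (an assumption here)."""
    return kavg_ms + davg_ms


print(f"%USD = {queue_used_percent(actv=16, lqlen=32):.0f}")
print(f"GAVG ~ {guest_latency_ms(kavg_ms=0.5, davg_ms=12.0):.1f} ms")
```

A %USD close to 100 together with a non-zero QUED value suggests commands are waiting in the kernel rather than on the array.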
  • 38. Disk SAN Rough Estimates Purely looking at a single ESX host, roughly: Throughput (in MBps) = (Outstanding IOs * Block size in KB) / latency in msec FC, rough maximums: Effective Link Bandwidth = ~80/90% of Real Bandwidth Effective (2Gbps) = 200 – 230 MBps Effective (4Gbps) = 410 – 460 MBps Effective (8Gbps) = 820 – 920 MBps iSCSI / NFS / FCoE, rough maximums: Effective Link Bandwidth = ~70/80% of Real Bandwidth Effective (1GigE) = 90 – 100 MBps Effective (10GigE) = 900 – 1000 MBps
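The rough estimates above can be reproduced in a few lines. The slide's effective figures are consistent with treating 1 Gbps as 128 MBps of real bandwidth; that conversion is inferred from the numbers rather than stated in the deck, and the function names are illustrative.

```python
def throughput_mbps(outstanding_ios, block_kb, latency_ms):
    """Slide formula: (outstanding IOs * block size in KB) / latency in ms."""
    return outstanding_ios * block_kb / latency_ms


def effective_range_mbps(link_gbps, low_fraction, high_fraction):
    """Effective bandwidth range, assuming 1 Gbps = 128 MBps real bandwidth."""
    real_mbps = link_gbps * 128
    return real_mbps * low_fraction, real_mbps * high_fraction


# 32 outstanding IOs of 32 KB at 10 ms latency
print(throughput_mbps(32, 32, 10))

# 4 Gbps FC at 80-90% efficiency, compare with the slide's 410-460 MBps
low, high = effective_range_mbps(4, 0.8, 0.9)
print(f"{low:.0f} - {high:.0f} MBps")
```

The same helper with fractions of 0.7 and 0.8 approximates the iSCSI / NFS / FCoE rows.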
  • 39. Disk Desired Latency Calculations Desired Latency in msec <= (Outstanding IOs * Block size in KB) / Throughput per host Example: Number of Hosts: 16 Effective Link Bandwidth: 90 MBps Throughput per host: 90 / 16 = 5.6 MBps Desired Latency: (32 * 32) / (5.6) = 182.86 msec
      Workload                  Cached Sequential Read   Cached Sequential Write
      Desired Latency (msec)    182.86                   182.86
      Observed Latency (msec)   ~350                     ~180
      Throughput Drop?          Yes                      No
      Throughput (MBps)         ~45                      ~90
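The worked example on this slide can be replayed as follows. The function names are illustrative; the inputs (32 outstanding IOs, 32 KB blocks, 5.6 MBps per host, observed latencies of roughly 350 and 180 msec) are taken from the slide.

```python
def desired_latency_ms(outstanding_ios, block_kb, throughput_per_host_mbps):
    """Slide formula: latency ceiling that still sustains the target throughput."""
    return outstanding_ios * block_kb / throughput_per_host_mbps


def throughput_drop_expected(observed_ms, desired_ms):
    """A drop is expected when observed latency exceeds the desired ceiling."""
    return observed_ms > desired_ms


per_host_mbps = 5.6  # 90 MBps shared by 16 hosts, rounded as on the slide
ceiling = desired_latency_ms(32, 32, per_host_mbps)
print(f"desired latency <= {ceiling:.2f} msec")

for workload, observed in [("cached sequential read", 350),
                           ("cached sequential write", 180)]:
    verdict = "drop" if throughput_drop_expected(observed, ceiling) else "no drop"
    print(f"{workload}: observed ~{observed} msec -> {verdict}")
```

Using the unrounded 90 / 16 = 5.625 MBps gives a slightly lower ceiling, which does not change either verdict.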
  • 40. Disk VI Client • SAN cache enabled: high throughput • SAN cache disabled: poor throughput
  • 41. Disk esxtop • Latency is quite high • After enabling cache, latency is reduced
  • 42. Virtual Machine Optimisation Deploy all machines from an optimised template! • VMware Tools MUST be installed • The disks MUST be block aligned to the storage (even when using NFS and SAN) • Where possible, always separate data disks from OS disks • Windows performance settings should be optimised for application performance • Guest operating system timeouts should be set as defined by the SAN vendor • Pagefile should be separated where appropriate (this can impact VMware SRM however) • Unused Windows services should be disabled (wireless config, print spooler, audio, etc.) • Last access time updates should be disabled (unless required) • Logging of the VM should be disabled (only enabled for troubleshooting) • Remove any unused virtual hardware (floppy drives, USB, etc.) • Disable screen savers and power saving features, including the logon screen saver • Enable Remote Desktop, avoid using the VI Client for remote administration • Install standard applications into the template (bginfo, antivirus, any host agents, etc.) • Multiple vCPUs should be allocated sparingly
  • 43. Virtual Machine Optimisation Block alignment is vital to good disk performance!
  • 44. esxtop Command Options (when inside esxtop) (Command: Action)
      – space: Update the display
      – ?: Show the help page
      – q: Quit
      – f/F: Add or remove columns from the display
      – o/O: Change the order the display is sorted
      – s: Change the update interval
      – #: Change the number of instances to display
      – W: Write configuration to file
      – e: Expand / Rollup CPU stats
      – V: View only VM instances
      – L: Change the length of the NAME field
      – m: Display memory statistics
      – n: Display network statistics
      – i: Display interrupt statistics
      – d: Display disk adapter statistics
      – u: Display disk device statistics
      – v: Display disk VM statistics
  • 45. esxtop Command Line Options from the console (Command: Action)
      – -b: Batch mode
      – -l: Locks the objects available in the first snapshot
      – -s: Enables secure mode
      – -a: Show all statistics
      – -c: Sets the configuration file
      – -R: Enables replay mode (used with “vm-support -S”)
      – -d: Sets the update interval
      – -n: Runs esxtop for n iterations
  • 46. esxtop Expand the default window size for your session to get all statistics
  • 47. vm-support Creates a packaged zip file containing the following sections:
      – boot: contains the grub configuration
      – etc: contains the Console OS configuration files (cron, tcpwrappers, syslog, etc.)
      – proc: contains much of the hardware configuration modules and variables
      – tmp: contains a lot of the ESX-specific configuration output
      – var: contains log files and any core dumps
      – vmfs: contains the structure of the VMFS datastores
      – esx3-installation (where appropriate): contains a copy of the previous esx3 configuration variables
  • 48. vm-support Using vm-support to extract performance information: vm-support -S -d <duration> -i <interval> (<duration> and <interval> are in seconds). The output can then be replayed in esxtop for review after it has been extracted: esxtop -R <path_to_vm-support_output>
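When planning a capture it helps to know how many snapshots a given duration and interval will produce. The sketch below only builds the two command lines from this slide as argument lists and does not execute them; the helper names and the example path are hypothetical, and the snapshot count is a simple estimate.

```python
def snapshot_plan(duration_s, interval_s):
    """Return the vm-support argument list and the approximate snapshot count."""
    command = ["vm-support", "-S", "-d", str(duration_s), "-i", str(interval_s)]
    snapshots = duration_s // interval_s  # estimate: one snapshot per interval
    return command, snapshots


def replay_command(extracted_path):
    """Return the esxtop argument list that replays an extracted capture."""
    return ["esxtop", "-R", extracted_path]


command, count = snapshot_plan(duration_s=300, interval_s=10)
print(" ".join(command), f"(about {count} snapshots)")
print(" ".join(replay_command("/tmp/extracted-capture")))  # hypothetical path
```

Shorter intervals give finer detail on replay at the cost of a larger bundle.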
  • 49. Service Console Performance • Multiple Service Console networks, for network resiliency • Increased Service Console memory, up to 800MB • Use host agents supplied by your vendors • Make the storage vendor's recommended tweaks, such as HBA Queue Depth and IO timeouts • Minimal use of the VI Client console; use RDP or SSH instead • Properly sized vCenter server, with a 64-bit OS where possible
  • 50. Resource Groups • Dynamically reallocate resource shares • Add a VM: shares allow you to over-commit resources and have a graceful re-allocation • Remove a VM: the extra resources are spread across all remaining VMs
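The re-allocation behaviour can be illustrated with a proportional-share calculation. This is a simplified sketch that assumes every VM is competing for the full capacity and ignores limits and reservations; the VM names, share values and 6000 MHz capacity are made up.

```python
def allocate_by_shares(capacity_mhz, shares):
    """Split capacity between VMs in proportion to their shares.

    Simplification: all VMs are assumed to be contending at once, and
    limits and reservations are not modelled.
    """
    total = sum(shares.values())
    return {vm: capacity_mhz * value / total for vm, value in shares.items()}


shares = {"vm1": 1000, "vm2": 1000, "vm3": 1000}
print(allocate_by_shares(6000, shares))

# Adding a VM shrinks every slice gracefully
print(allocate_by_shares(6000, {**shares, "vm4": 1000}))

# Removing a VM lets the remaining VMs use the freed capacity
print(allocate_by_shares(6000, {"vm1": 1000, "vm2": 1000}))
```

Shares only matter under contention; when the host has spare capacity each VM simply gets what it asks for.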
  • 51. Design Guidelines • Full Resilience / Multiple paths • Standard configuration across all aspects (ESX, Storage, Networking, etc.) • Standard naming conventions • Learn from others' mistakes • Follow vendors' best-practice guidelines • Rule out the basics before requesting support
  • 52. Capacity Planner & P2V Cautions and Limitations • Peak CPU usage can sometimes be misleading • Back-end storage system performance • P2V machines will require block-aligning to the storage • P2V machines will still require guest OS optimisation
  • 53. Conclusion • Performance issues can often be traced with simple root cause analysis using basic tools (VI Client / esxtop) • Performance tools help diagnose issues and help rule out non-issues • Performance tools are useful in different contexts, not always either/or • Real-time data and troubleshooting: esxtop • Historical data: VI Client • Coarse resource / cluster usage: VI Client • Detailed resource usage: esxtop • Combine information from various tools to get a complete picture • Always benchmark your systems first so you know what optimal performance you can expect
  • 54. Reference Articles • http://www.vmware.com/pdf/esx3_memory.pdf • http://www.vmworld.com/docs/DOC-2370 • http://blogs.vmware.com/performance/ • http://communities.vmware.com/docs/DOC-5420 • http://kb.vmware.com/kb/1008205 • http://communities.vmware.com/community/vmtn/general/performance • http://www.vmware.com/products/vmmark/ • http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf • http://www.vmware.com/pdf/vsphere4/r40/vsp_40_iscsi_san_cfg.pdf • http://www.vmware.com/pdf/vsphere4/r40/vsp_40_resource_mgmt.pdf • http://www.vmware.com/pdf/GuestOS_guide.pdf • http://www.vmware.com/resources/techresources/10066 • http://www.vmware.com/resources/techresources/10059 • http://www.vmware.com/resources/techresources/10062