VMware vSphere Performance Troubleshooting
 

From the Lewan

  • Who uses Resource Pools? How many have reservations or limits?
  • Use a Host CPU stacked (per VM) graph to quickly identify leading consumers
  • Don’t necessarily need CPU saturation for overcommit to have an effect on performance

VMware vSphere Performance Troubleshooting: Presentation Transcript

  • vSphere Performance Monitoring and Troubleshooting
    Overview
    What?
    CPU, Memory, Disk, Network
    How?
    Use available tools and a systematic methodology
    Why?
    Need to build confidence in virtualizing critical and high demand applications
  • vSphere Performance Monitoring and Troubleshooting
    Top Issues:
    Storage "performance capacity" oversubscription
    Memory oversubscription
    SMP overuse
    Firmware & driver issues
  • vSphere Performance Monitoring and Troubleshooting
    What tools do we have at our disposal?
    Top tools for information collection:
    vCenter - Performance charts and alarms
    Guest OS* - Task Manager/Resource Monitor and PerfMon
    ESX Host - esxtop and vscsiStats
    vSphere PowerCLI
    *Guest based monitoring is subject to inaccuracy
  • vSphere Performance Monitoring and Troubleshooting
    Prepare vCenter Settings
    Prepare custom vCenter alerts:
    Host Console Swap In Rate: 512 KBps Warning, 1024 KBps Alert
    Host Console Swap Out Rate: 512 KBps Warning, 1024 KBps Alert
    VM CPU Ready: 1000 ms Warning, 2000 ms Alert
    VM Disk Latency: 20 ms Warning, 50 ms Alert
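The alarm thresholds above can also be applied to exported samples outside vCenter. The following is a minimal Python sketch, not a vCenter feature: the dictionary keys and the `classify` function are hypothetical names, and it assumes a sample triggers at or above the listed value (the comparison operator is not stated on the slide).

```python
# Thresholds copied from the slide as (warning, alert) pairs.
# The key names are hypothetical labels chosen for this sketch.
THRESHOLDS = {
    "host_swap_in_kbps": (512, 1024),
    "host_swap_out_kbps": (512, 1024),
    "vm_cpu_ready_ms": (1000, 2000),
    "vm_disk_latency_ms": (20, 50),
}


def classify(metric, value):
    """Return 'ok', 'warning', or 'alert' for a single sample.

    Assumption: a sample triggers when it is at or above the threshold.
    """
    warning, alert = THRESHOLDS[metric]
    if value >= alert:
        return "alert"
    if value >= warning:
        return "warning"
    return "ok"
```

For example, `classify("vm_disk_latency_ms", 35)` falls between the two disk latency thresholds and is reported as a warning.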
  • vSphere Performance Monitoring and Troubleshooting
    Prepare esxtop
    ESXTOP realtime monitoring:
    esxtop (run command from SSH or tech-support mode)
    s 2 (refresh view every 2 seconds)
    V (view VMs only)
    h (for quick in-tool command reference)
    Batch mode for a 5-minute capture of all stats (150 samples at 2-second intervals):
    esxtop -b -a -d 2 -n 150 > esxtop_capture.csv
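The batch capture can be sized and skimmed with a few lines of Python. This is a rough sketch under assumptions: `sample_count` and `peak_by_column` are hypothetical helper names, and it assumes the capture is a CSV with one column per counter whose header ends with the counter name (for example `% Ready`). Check the header row of a real capture before relying on the suffix match.

```python
import csv
import io


def sample_count(duration_s, delay_s):
    """Iterations (-n) needed to cover duration_s at a delay (-d) of delay_s seconds."""
    return -(-duration_s // delay_s)  # ceiling division on integers


def peak_by_column(csv_text, suffix):
    """Return {column header: maximum value} for headers ending with suffix.

    Assumption: the first row is the header and each later row is one sample.
    Cells that are missing or not numeric are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name.endswith(suffix)]
    peaks = {}
    for row in reader:
        for i in wanted:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue
            name = header[i]
            peaks[name] = max(peaks.get(name, value), value)
    return peaks
```

A 5-minute capture at a 2-second delay gives `sample_count(300, 2)`, which is the 150 used in the command above. Reading the file contents and calling `peak_by_column(text, "% Ready")` lists the worst sample per matching column.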
  • vSphere Performance Monitoring and Troubleshooting
    Prepare PowerCLI
    Run PowerCLI:
    Tip: Run as Administrator
    Set-ExecutionPolicy RemoteSigned
    Connect-VIServer -Server <host> -Protocol https -User <user> -Password <pass>
    <host> can be IP address or name of ESX server or vCenter
    Get-VM
    Get-Stat -common -realtime
  • vSphere Performance Monitoring and Troubleshooting
    Where do we get started?
  • vSphere Performance Monitoring and Troubleshooting
    Network Overview
  • vSphere Performance Monitoring and Troubleshooting
    Network
    Troubleshooting Guidance:
    1. Physical Issues - A bad cable, a failing switch port or NIC, or an incompatible/flawed firmware or device driver (use VMXNET3 whenever possible)
    2. Configuration Issues - Inconsistent configuration of vSwitches, Port Groups, or upstream VLAN trunks
    3. Capacity Issues - Too many VMs on a single NIC; inadequate switch backplane or uplink capacity; sharing “unmanaged” network infrastructure for storage and data
    4. Thresholds – Bandwidth saturation, dropped packets
  • vSphere Performance Monitoring and Troubleshooting
    Network – What can we see?
  • vSphere Performance Monitoring and Troubleshooting
    Network
    vCenter Metrics:
    Receive packets dropped
    Transmit packets dropped
  • vSphere Performance Monitoring and Troubleshooting
    Network
    ESXTOP Metrics:
  • vSphere Performance Monitoring and Troubleshooting
    Network
    ESXTOP Commands:
    esxtop
    s 2
    n
    f
  • vSphere Performance Monitoring and Troubleshooting
    Network
    ESXTOP Example:
  • vSphere Performance Monitoring and Troubleshooting
    Network
    PowerCLI Commands:
    Get-Stat -net -realtime
    Get-Stat -Entity <Host> -stat net.droppedRx.summation
    Get-Stat -Entity <Host> -stat net.droppedTx.summation
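Raw dropped-packet counts are easier to judge as a share of traffic. A minimal sketch, assuming the dropped count and the delivered count for the same interval are exported separately and that dropped packets are not already included in the delivered count; `drop_percent` is a hypothetical helper, not part of PowerCLI.

```python
def drop_percent(dropped, delivered):
    """Percentage of packets dropped during one sample interval.

    Assumption: 'delivered' excludes the dropped packets, so the total
    offered traffic is dropped + delivered.
    """
    total = dropped + delivered
    if total == 0:
        return 0.0  # no traffic in the interval, so nothing was dropped
    return 100.0 * dropped / total
```

For instance, 5 dropped packets against 995 delivered is `drop_percent(5, 995)`, or 0.5 percent for that interval.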
  • vSphere Performance Monitoring and Troubleshooting
    Network – What can’t we see?
  • vSphere Performance Monitoring and Troubleshooting
    Network
    Possible resources for external monitoring:
    Native Telnet/SSH/HTTP-based interface counters and stats
    Third-party SNMP, NetFlow and ICMP tools
  • vSphere Performance Monitoring and Troubleshooting
    CPU Overview
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    Troubleshooting Guidance:
    1. Physical Issues - Rare and always catastrophic (i.e. obvious)
    2. Configuration Issues - Too many / too few vCPUs per VM; SMP/HAL mismatch; incorrect CPU affinity settings
    3. Capacity Issues - CPU saturation at the guest or host level; CPU starvation due to high IO or other system level ops
    4. Thresholds – Waiting for CPU cycles (due to co-scheduling, swapping, high IO)
  • vSphere Performance Monitoring and Troubleshooting
    CPU – What can we see?
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    vCenter Metrics:
    Host/Guest Saturation
    Stacked Graph (per VM)
    Usage
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    vCenter Metrics:
    Guest
    Ready (real-time chart, 20-second samples: value in ms / 200 = n%)
    Swap Wait
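The Ready counter is reported as milliseconds accumulated over the sample interval, so turning it into a percentage means dividing by the interval length. A small sketch, assuming the 20-second real-time interval; `ready_percent` and its `vcpus` parameter are hypothetical, and dividing by the vCPU count is a common convention for a per-vCPU figure rather than something the slides state.

```python
def ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU Ready summation (ms) into a percentage of the interval.

    interval_s: sample interval in seconds (20 s assumed for real-time stats).
    vcpus: divide by the vCPU count to get a per-vCPU average (assumption).
    """
    interval_ms = interval_s * 1000
    return ready_ms * 100 / (interval_ms * vcpus)
```

With these assumptions, 1000 ms of ready time in a 20-second sample is 5 percent, and 2000 ms is 10 percent.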
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    ESXTOP Metrics:
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    ESXTOP Commands:
    esxtop
    s 2
    V
    c
    e GID (expand/contract a VM world)
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    ESXTOP Example:
    Excessive vCPUs
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    ESXTOP Example:
    Now with fewer vCPUs
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    ESXTOP Example:
    SMP impacting multiple VMs
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    PowerCLI Example
    Get-Stat -cpu
    Get-Stat -Entity <VM> -stat cpu.ready.summation -realtime
    Very cool script code at:
    http://www.peetersonline.nl/index.php/vmware/examine-vmware-cpu-ready-times-with-powershell/
  • vSphere Performance Monitoring and Troubleshooting
    CPU – Not much else to see…
  • vSphere Performance Monitoring and Troubleshooting
    CPU
    Possible resources for external monitoring:
    Vendor specific systems management tools,
    MS System Center, etc.
  • vSphere Performance Monitoring and Troubleshooting
    Memory Overview
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    Troubleshooting Guidance:
    1. Physical Issues - Rare and usually catastrophic
    2. Configuration Issues - Memory overcommit; incorrect configuration of shares, reservations or limits
    3. Capacity Issues - Physical memory exhaustion
    4. Thresholds – Active memory swapping
  • vSphere Performance Monitoring and Troubleshooting
    Memory – What can we see?
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    vCenter Metrics
    Swap in rate
    Swap out rate
    Swap used
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    ESXTOP Metrics:
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    ESXTOP Commands:
    esxtop
    s 2
    V
    m
    f
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    ESXTOP Example:
    m – Heavy swapping and ballooning
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    PowerCLI Commands:
    Get-Stat -mem
    Get-Stat -Entity <VM> -stat mem.swapoutRate.average -realtime
    Get-Stat -Entity <VM> -stat mem.swapinRate.average -realtime
    Get-Stat -Entity <VM> -stat mem.vmmemctl.average -realtime
    Get-Stat -Entity <Host> -stat mem.swapused.average -realtime
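A single swap-rate sample says little; what matters is whether swapping is sustained. The sketch below summarizes a series of exported swap-in or swap-out samples. It assumes the values are in KBps and reuses 512 KBps as the default cutoff; `swap_summary` is a hypothetical helper name.

```python
from statistics import mean


def swap_summary(samples_kbps, warning_kbps=512):
    """Summarize swap-rate samples (KBps assumed).

    Returns the average, the peak, and the fraction of samples at or above
    warning_kbps, or None when there are no samples.
    """
    if not samples_kbps:
        return None
    over = sum(1 for s in samples_kbps if s >= warning_kbps)
    return {
        "avg": mean(samples_kbps),
        "max": max(samples_kbps),
        "fraction_over": over / len(samples_kbps),
    }
```

A high `fraction_over` suggests active, ongoing swapping rather than a brief spike.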
  • vSphere Performance Monitoring and Troubleshooting
    Memory – The occasional DIMM failure…
  • vSphere Performance Monitoring and Troubleshooting
    Memory
    Possible external monitoring options:
    Vendor specific systems management tools, MS System Center, etc.
    Don’t forget vCenter ‘Hardware Status’ reporting
  • vSphere Performance Monitoring and Troubleshooting
    Storage Overview
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    Troubleshooting Guidance:
    1. Physical Issues - A bad cable, a failing switch port or HBA/NIC, or an incompatible/flawed firmware or device driver (use LSI Logic Parallel/SAS as appropriate)
    2. Configuration Issues - Inconsistent or incorrect configuration of LUN masking, zoning, or multi-pathing; inappropriate resource provisioning; queue depth not aligned with the storage type
    3. Capacity Issues - Too many VMs or VMDKs on a LUN; too much IO load for an array or RAID group
    4. Thresholds – Latency and queuing
  • vSphere Performance Monitoring and Troubleshooting
    Storage – What can we see?
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vCenter Metrics:
    Datastore
    Read latency
    Write latency
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Metrics:
  • vSphere Performance Monitoring and Troubleshooting
    Storage
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Commands (HBA/LUN):
    esxtop
    s 2
    V
    d
    f
    e vmhba#
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Commands (LUN/Datastore):
    esxtop
    s 2
    V
    u
    L 38
    f
    e <devname>
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Commands (VM/VMDK):
    esxtop
    s 2
    V
    v
    f
    e GID
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Examples:
    d - Multipathing / Expand adapter to view targets
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Examples:
    u - Queuing, Disk or Kernel?
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    ESXTOP Examples:
    v - Identify the IO consumer
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vscsiStats Commands:
    [root@host ~]# cd /usr/lib/vmware/bin
    ./vscsiStats -l
    ./vscsiStats -s -w <worldid>
    ./vscsiStats -w <worldid> -p all -c > /path/vscsistats.csv
    ./vscsiStats -x
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vscsiStats Example:
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vscsiStats Example:
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vscsiStats Example:
    http://dunnsept.wordpress.com/2010/03/11/new-vscsistats-excel-macro/
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    vscsiStats histograms:
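Once a histogram has been exported, a useful question is what share of IOs completed within a given latency. A minimal sketch, assuming the histogram has been reduced to two parallel lists (bucket upper limits and IO counts per bucket); `fraction_at_or_below` is a hypothetical helper and the numbers in the usage note are made up for illustration.

```python
def fraction_at_or_below(bucket_limits, counts, limit):
    """Fraction of IOs that fall in buckets whose upper limit is <= limit.

    bucket_limits: upper bound of each histogram bucket (same unit as limit).
    counts: number of IOs recorded in each bucket, in the same order.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    covered = sum(c for b, c in zip(bucket_limits, counts) if b <= limit)
    return covered / total
```

With illustrative limits of `[1000, 5000, 15000, 30000]` microseconds and counts of `[10, 20, 50, 20]`, `fraction_at_or_below(limits, counts, 15000)` reports that 80 percent of IOs finished within 15 ms.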
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    PowerCLI Commands:
    Get-Stat -disk
    Get-Stat -stat disk.totalLatency.average -realtime
    Get-Stat -stat disk.deviceLatency.average -realtime
    Get-Stat -stat disk.kernelLatency.average -realtime
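The three latency counters are related: the total seen by the guest is roughly the device latency plus the kernel latency, so comparing the two parts shows whether time is lost in the array or in host-side queuing. A sketch under that assumption, using 20 ms and 50 ms as default cutoffs; `diagnose_latency` is a hypothetical helper, not a PowerCLI cmdlet.

```python
def diagnose_latency(device_ms, kernel_ms, warning_ms=20, alert_ms=50):
    """Classify total latency and name the larger contributor.

    Assumption: total latency is approximately device + kernel latency.
    """
    total_ms = device_ms + kernel_ms
    if total_ms >= alert_ms:
        level = "alert"
    elif total_ms >= warning_ms:
        level = "warning"
    else:
        level = "ok"
    # Kernel-dominated latency points at host-side queuing; device-dominated
    # latency points at the array, fabric, or disks.
    dominant = "kernel" if kernel_ms > device_ms else "device"
    return {"total_ms": total_ms, "level": level, "dominant": dominant}
```

For example, `diagnose_latency(30, 1)` is a warning dominated by the device, which suggests looking at the storage array or fabric before tuning the host.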
  • vSphere Performance Monitoring and Troubleshooting
    Storage – What can’t we see?
  • vSphere Performance Monitoring and Troubleshooting
    Storage – More of what we can’t see
  • vSphere Performance Monitoring and Troubleshooting
    Storage
    Possible external monitoring solutions:
    Vendor specific SAN and fabric/network tools, native Telnet/SSH/HTTP-based tools for most networks, third-party SNMP-based tools
  • vSphere Performance Monitoring and Troubleshooting
    Working with PowerCLI
    PowerCLI Tips:
    For a complete list of stat objects:
    Get-StatType -Entity <Host/VM>
    Pipe the outputs to a file:
    Get-Stat -stat <stat> -realtime | ft -autosize > c:\temp\<filename>.csv
    Import the CSV file data to a spreadsheet with fixed width parameters
    Build pretty graphs
  • vSphere Performance Monitoring and Troubleshooting
    Working with PowerCLI
  • vSphere Performance Monitoring and Troubleshooting
    Way More Information
    ESXTOP / vscsiStats / PowerCLI:
    http://www.yellow-bricks.com/esxtop/ Special thanks to Duncan Epping!
    http://communities.vmware.com/docs/DOC-3930
    http://communities.vmware.com/docs/DOC-9279
    http://communities.vmware.com/docs/DOC-10095
    http://www.vmware.com/support/developer/PowerCLI/PowerCLI41/html/Get-Stat.html
    http://www.lucd.info/2009/12/30/powercli-vsphere-statistics-part-1-the-basics/
    http://simongreaves.co.uk/blog/esxtop-guide
    http://dunnsept.wordpress.com/2010/03/11/new-vscsistats-excel-macro/
  • vSphere Performance Monitoring and Troubleshooting
    Easy button?
    What is the problem with these tools?
    Limited alerting mechanisms, no collection automation or historical data for comparison, and no correlation of events!
    vCenter Operations Standard / Enterprise