VMworld 2013: Performance and Capacity Management of DRS Clusters

VMworld 2013

Anne Holler, VMware
Ganesha Shanmuganathan, VMware

Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare


  1. Performance and Capacity Management of DRS Clusters. Anne Holler, VMware; Ganesha Shanmuganathan, VMware (VSVC5821 #VSVC5821)
  2. Disclaimer
     • This session may contain product features that are currently under development.
     • This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.
     • Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
     • Technical feasibility and market demand will affect final delivery.
     • Pricing and packaging for any new technologies or features discussed or presented have not been determined.
  3. DRS = Distributed Resource Scheduler
     • The overall design goals of DRS are:
       • Optimize VM performance subject to user control settings
       • Provide resource isolation and sharing for subsets of VMs
       • Use infrastructure and management resources efficiently
       • Provide comprehensive automatic cluster management
     • Mechanisms:
       • Initial placement / load balancing
       • QoS enforcement: shares, reservations, limits, resource pools
       • Policy enforcement: affinity rules, anti-affinity rules
       • Evacuation for host maintenance
     [Diagram: a DRS cluster of ESX hosts, each running multiple VMs]
  4. Key elements to achieve design goals
     • Computing/delivering VM CPU and memory resource entitlements (automatic VM initial placement and automatic migration)
     • Mapping the cluster resource pool tree onto individual hosts
     • Modeling vMotion remediation costs
     • Respecting constraints: compatibility, availability, host state, rules
     Let's look at each of these elements and examine advanced deployment situations for each, along with tips to handle them.
  5. DRS Advanced Options
  6. Computing/Delivering VM CPU and Memory Resource Entitlements
  7. Dynamic Entitlement
     • A VM's dynamic entitlement (DE) is what the VM would get if the cluster were one giant host
     • Takes into account VM resource controls and demand
     [Diagram: six hosts, each with 10 GHz CPU and 64 GB memory, modeled as one "giant host" with 60 GHz CPU and 384 GB memory]
  8. VM Resource Controls
     • Reservation: guaranteed allocation
     • Limit: guaranteed upper bound
     • Shares: allocation in between
     • Resource pools: allocation and isolation for groups of VMs
     Actual allocation depends on the shares and demand.
     [Diagram: allocation scale from 0 to configured size (8 GB), showing reservation (1 GB), actual allocation (5 GB), and limit (6 GB)]
  9. CPU Entitlement: Close-up on CPU Demand Estimate
     • By ESX:
       • CPU demand = used + stolen * run / (run + sleep)
       • Stolen time includes:
         • Ready: vCPU is runnable but the target CPU is busy
         • Overlap: use of CPU to handle interrupts during this vCPU's execution
         • Hyperthreading: impact on CPU operation due to use of the partner CPU
         • Power management: loss of CPU cycles due to platform frequency scaling
       http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-Perf.pdf
     • By DRS:
       • CPU demand = ESX CPU demand over time
       • Load balancing: average over the last 5 minutes
       • Cost/benefit & DPM: maximum over an extended period (up to 60 minutes)
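A minimal sketch of the demand formula on this slide. The formula itself comes from the slide; the counter names, units, and example values are illustrative assumptions, and the real ESX scheduler also folds in hyperthreading and power-management terms:

```python
def esx_cpu_demand(used_mhz, ready_mhz, overlap_mhz, run_ms, sleep_ms):
    """ESX CPU demand = used + stolen * run / (run + sleep).

    'Stolen' time is work the vCPU wanted to do but could not get
    (ready, overlap, plus HT/power-management terms in real ESX); the
    run/(run+sleep) factor scales it by how often the vCPU was active.
    """
    stolen_mhz = ready_mhz + overlap_mhz  # simplified stolen-time model
    return used_mhz + stolen_mhz * run_ms / (run_ms + sleep_ms)

# Example: a busy vCPU that spent some time ready behind other vCPUs.
print(esx_cpu_demand(used_mhz=1800, ready_mhz=300, overlap_mhz=50,
                     run_ms=45000, sleep_ms=15000))  # ~2062 MHz
```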
  10. CPU Entitlement: Satisfies VM Demand unless Contention
     • Contention monitoring: ready time
       • Rule of thumb: ready <= 5% per vCPU
       • E.g., a 4-vCPU VM should stay <= 20% during low-contention periods
     • Higher ready values do not necessarily indicate problems
       • E.g., NUMA scheduling: a vCPU has affinity for the node containing its memory
     • DRS considers ready time in CPU demand, but ready-time imbalance has been reported
     [Chart: CPU usage vs. ready time]
  11. CPU Entitlement Scenarios
     • High CPU ready time; a manual vMotion improved performance
       • Why? DRS averaging underestimated the demand of a spiky CPU workload
       • Tip/2013: introduced the AggressiveCPUActive advanced option (see the sketch below), which uses the larger of:
         • the 5-minute average of ESX CPU demand [matches the pre-2013 computation]
         • the 80th percentile (2nd largest) of the last five 1-minute averages of ESX CPU demand
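A sketch of the AggressiveCPUActive computation described above. The max-of-average-and-percentile rule is from the slide; the sample values and function shape are assumptions:

```python
def aggressive_cpu_active(one_minute_averages_mhz):
    """Return the larger of the 5-minute average and the 2nd-largest
    (~80th percentile) of the last five 1-minute ESX CPU demand averages."""
    assert len(one_minute_averages_mhz) == 5
    five_min_avg = sum(one_minute_averages_mhz) / 5
    second_largest = sorted(one_minute_averages_mhz)[-2]
    return max(five_min_avg, second_largest)

# A spiky workload: the percentile term captures bursts that the plain
# 5-minute average would smooth away.
print(aggressive_cpu_active([200, 2400, 300, 2500, 250]))  # 2400, not 1130
```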
  12. CPU Entitlement Scenarios, continued
     • CPU ready time high, but host lightly utilized
       • Why? Platform-level power management operating beneath the ESX layer
       • Tip: set the BIOS power management setting to OS Control
       http://blogs.vmware.com/vsphere/2012/01/having-a-performance-problem-hard-to-resolve-have-you-checked-your-host-bios-lately.html
  13. Memory Entitlement: Close-up on Memory Demand Estimate
     • By ESX:
       • Memory demand = max over the last 4 minutes of oneMinuteActive
       • oneMinuteActive is computed as follows:
         • Unmap a random sample of pages
         • Check mapping activity over a minute
         • Scale up to memSize
       • Accounts for large, swapped, and ballooned pages
     • By DRS:
       • Memory demand = ESX memory demand over time + headroom:
         • Load balancing: average over the last 5 minutes + 25% (default) of consumed idle memory
         • Cost/benefit & DPM: maximum over an extended period + 0% of consumed idle memory
     [Diagram: 4 GB consumed, 1 GB idle, 2 GB ESX demand, and the resulting DRS demand]
     http://www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdf
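A small sketch of the DRS load-balancing memory demand described above (ESX active memory plus a percentage of consumed idle memory). The 25%/100% knob matches the PercentIdleMBInMemDemand option discussed later; the input shape and numbers are assumptions:

```python
def drs_mem_demand_mb(esx_active_mb, consumed_mb, percent_idle_in_demand=25):
    """DRS load-balancing memory demand: ESX active memory plus a
    percentage (default 25%) of the idle consumed memory as headroom."""
    idle_mb = max(consumed_mb - esx_active_mb, 0)
    return esx_active_mb + idle_mb * percent_idle_in_demand / 100

# Illustrative numbers: 4 GB consumed, 2 GB active.
print(drs_mem_demand_mb(esx_active_mb=2048, consumed_mb=4096))       # 2560.0
print(drs_mem_demand_mb(2048, 4096, percent_idle_in_demand=100))     # 4096.0
```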
  14. Memory Entitlement: Satisfies VM Demand unless Contention
     • Contention monitoring: amount of reclamation
       • Reclamation: ballooning, page sharing, compression, ESX swapping
       • Customers often monitor for non-zero ballooning and ESX swapping values
       • Some ballooning can allow more efficient memory usage
       • The DRS memory demand estimate has been lower than desired in some cases
     Note: ballooning might cause guest-internal swapping.
  15. Memory Entitlement Scenarios
     • Scenario: DRS performed an undesired VM memory imbalance move
       • Why? DRS demand includes only 25% of idle memory
       • Customer preferred 100%, i.e., wanted DRS to manage consumed memory as demand
       • Tip/2013: use the new option PercentIdleMBInMemDemand (default 25%)
         • Can be set to 100% to have both load balancing and cost/benefit manage to consumed memory
       • Legacy tip: change the IdleTax option (default 75%, so 100 - 75 = 25% of idle is added)
         • Only influences load balancing
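DRS advanced options like PercentIdleMBInMemDemand are set per cluster. A hedged pyVmomi sketch using the standard cluster reconfigure call; it assumes you have already connected a ServiceInstance and looked up `cluster` as a vim.ClusterComputeResource:

```python
from pyVmomi import vim

def set_drs_advanced_option(cluster, key, value):
    """Add or update a DRS advanced option on a DRS-enabled cluster.

    `cluster` is assumed to be a vim.ClusterComputeResource obtained
    from an already-connected pyVmomi ServiceInstance.
    """
    spec = vim.cluster.ConfigSpecEx(
        drsConfig=vim.cluster.DrsConfigInfo(
            option=[vim.option.OptionValue(key=key, value=str(value))]))
    # modify=True merges this change into the existing cluster config.
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)

# e.g., make DRS manage consumed memory as demand:
# set_drs_advanced_option(cluster, "PercentIdleMBInMemDemand", 100)
```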
  16. Memory Entitlement Scenarios, continued
     • Scenario: customer found DPM memory consolidation too aggressive
       • Why? DPM consolidated on active memory without adding any idle consumed memory
       • Customer wanted to reduce the later impact of demand-paging ballooned/swapped pages
       • Tip/2013: added the new option PercentIdleMBInMemDemand (default 25%)
         • Can be set to 100% to have DPM manage to consumed memory
  17. Mapping Cluster Resource Pools onto the Hosts
  18. Resource Pools in Cluster
     [Diagram: cluster resource pool tree, labeled with reservation (MHz/MB) and limit (MHz/MB): Root contains RP1 (500, 8000) and RP2 (200, 400); RP1 contains VM1 and VM2 (100, 8000 each), RP2 contains VM3 (100, 400); shares shown per node]
  19. Resource Pools in Cluster
     • Cluster-wide resource pools
     [Diagram: the cluster-wide RP tree, annotated with per-VM demand (VM1 D=1000, VM2 D=500, VM3 D=200), is split into per-host shadow trees: RP1 on Host A (200, 3000) holding VM1, and RP1 on Host B (300, 5000) holding VM2]
  20. Resource Pools in Cluster
     [Diagram: resulting per-host trees: Host A runs RP1 (200, 3000) with VM1 (D=1000) and RP2 (200, 400) with VM3 (D=200); Host B runs RP1 (300, 5000) with VM2 (D=500)]
  21. Mapping the Cluster RP Tree onto Individual Hosts: Scenario
     • DRS RP flow too slow to maintain desired VM performance
       • Why? Conservative host RP reservations and limits capped the response to spikes
       • Tip: set CapRpReservationAtDemand to 0 (default: 1) to have DRS distribute all RP reserved resources, rather than just what is needed for demand (see the sketch below)
       • Tip/2013: set AllowUnlimitedCpuLimitForVms to 0 (default: 1) to have DRS distribute limits as much as possible
     [Diagram: per-host RP trees on Host A and Host B with divided reservations and limits]
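A toy sketch of the idea behind the divvying step: a pool's cluster-level reservation is split across per-host shadow pools, either capped at each host's demand (the default) or distributed fully (CapRpReservationAtDemand = 0). This is an illustrative model under those assumptions, not the actual DRS algorithm:

```python
def divvy_rp_reservation(rp_reservation_mhz, per_host_demand_mhz,
                         cap_at_demand=True):
    """Split a resource pool's reservation across hosts in proportion
    to the demand of the pool's VMs on each host."""
    total_demand = sum(per_host_demand_mhz.values())
    shares = {host: rp_reservation_mhz * demand / total_demand
              for host, demand in per_host_demand_mhz.items()}
    if cap_at_demand:
        # Default behavior: never give a host more than its VMs demand.
        shares = {host: min(s, per_host_demand_mhz[host])
                  for host, s in shares.items()}
    return shares

# Reservation larger than demand: capping leaves some of it undistributed.
print(divvy_rp_reservation(2000, {"hostA": 500, "hostB": 300}))
# capped: {'hostA': 500, 'hostB': 300}
print(divvy_rp_reservation(2000, {"hostA": 500, "hostB": 300},
                           cap_at_demand=False))
# full flow: {'hostA': 1250.0, 'hostB': 750.0}
```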
  22. Modeling vMotion Remediation Costs
  23. Cost-Benefit Filtering
     • Benefit: higher resource availability
     • Cost:
       • Migration cost: vMotion CPU and memory cost, VM slowdown
       • Risk cost: the benefit may not be sustained due to load variation
     [Chart: gain (MHz or MB) vs. time (sec) over the invocation interval: a loss during migration time (migration cost), a possible loss afterward (risk cost), and a benefit accruing over the stable time]
  24. Cost-Benefit Filtering
     • vMotions caused unacceptable performance degradation
       • Why? DRS cost-benefit didn't capture the high sensitivity of these VMs to vMotion
       • Tip: tune cost-benefit to model vMotion cost aggressively: set IgnoreDownTimeLessThan to 0 (assumes UseDownTime is set to 1, its default); see the sketch below
     [Chart: the same gain-vs-time cost-benefit picture, with the migration cost weighted more heavily]
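A simplified sketch of the trade-off in the chart: a move is worthwhile only if the benefit accrued over the expected stable time outweighs the migration and risk costs. The linear model and all numbers are assumptions for illustration:

```python
def net_gain_mhz_seconds(gain_mhz, stable_time_s,
                         migration_time_s, migration_loss_mhz,
                         risk_time_s, risk_loss_mhz):
    """Benefit accrues while the move stays good; costs are the loss
    during the vMotion itself plus the risk that load shifts again."""
    benefit = gain_mhz * stable_time_s
    migration_cost = migration_loss_mhz * migration_time_s
    risk_cost = risk_loss_mhz * risk_time_s
    return benefit - migration_cost - risk_cost

# Accept the move only if the net gain is positive.
print(net_gain_mhz_seconds(gain_mhz=500, stable_time_s=240,
                           migration_time_s=30, migration_loss_mhz=800,
                           risk_time_s=60, risk_loss_mhz=400) > 0)  # True
```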
  25. Modeling vMotion Costs: Scenario
     • DRS handles VMs known to be highly sensitive to vMotion
       • 2013: powered-on low-latency VMs are treated as soft-affine to their current host
       • 2013: powered-on VMs with vFlash Cache reservations are soft-affine to the host holding their vFlash SSD
  26. vMotion Costs: Scenario
     • DRS left the cluster severely imbalanced
       • Why? VM happiness is the primary metric, and DRS filtered moves aggressively
     • By default, DRS becomes more aggressive when imbalance is severe:
       • FixSevereImbalanceOnly = 1 (default)
       • SevereImbalanceRelaxMinGoodness = 1 (default)
       • SevereImbalanceRelaxCostBenefit = 1 (default)
     • Tip/extreme: if the above defaults still leave more imbalance than desired, you can use:
       • UseDownTime = 0
       • FixSevereImbalanceOnly = 0 (handle with care!)
       • SevereImbalanceDropCostBenefit = 1 (handle with care!)
  27. Respecting Constraints
  28. Respecting Constraints (e.g., availability, rules)
     • Customers may express business rules to influence load balancing
       • E.g., use VM-VM anti-affinity rules for availability
       • E.g., use VM-host affinity rules for software licensing
     [Diagram: VM-host affinity to a host group; VM-VM anti-affinity]
  29. Asymmetric Cluster Scenario
     • Asymmetric storage or network access cost
       • E.g., if moving a VM off a set of hosts would raise its storage network latency (traffic crossing racks, or L2 stretched over L3), use a soft affinity rule to keep such VMs on the hosts with lower access cost
     [Diagram: DRS cluster spanning two ToR switches connected via a router]
  30. Respecting Constraints (e.g., availability, rules): Scenarios
     • Stretched cluster with VMs placed on hosts near their primary storage
       • The current solution of using soft VM/host rules to partition VMs between sites allows VMs to violate the rules if any host is over-utilized
       • Tip/2013: added support for semi-hard VM/host rules (drop soft VM/host rules only for constraints, not for high utilization) via the option DropSoftVmHostRulesOverutilized = 1
     [Diagram: two sites connected over a WAN with storage replication]
  31. Respecting Constraints (e.g., availability, rules): Scenarios
     • More VMs per host than wanted with respect to failure impact (eggs per basket)
       • Why? By default, DRS allows up to the ESX-supported VM limit and balances CPU and memory, not the number of VMs
       • Tip: use the LimitVMsPerESXHost option to restrict the number of VMs on a host
         • Inflexible; requires manual scaling. E.g., LimitVMsPerESXHost = 6
       • Tip/2013: use the LimitVMsPerESXHostPercent option to restrict the number of VMs based on a tolerance; flexible and automatic
         • Number of VMs on host = mean + (buffer% * mean), as computed in the sketch below
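A quick worked computation of the per-host cap implied by LimitVMsPerESXHostPercent, using the slide's formula (mean VMs per host plus a buffer percentage); rounding down is an assumption:

```python
import math

def vms_per_host_cap(total_vms, num_hosts, buffer_percent):
    """Number of VMs allowed on a host = mean + (buffer% * mean)."""
    mean = total_vms / num_hosts
    return math.floor(mean + mean * buffer_percent / 100)

# 120 VMs on 10 hosts with a 50% buffer: mean 12, cap 18 VMs per host.
print(vms_per_host_cap(total_vms=120, num_hosts=10, buffer_percent=50))  # 18
```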
  32. Capacity Management
  33. Cluster Capacity
     • Sum(VM CPU reservations) < 75% of the cluster CPU capacity available for VMs
     • Sum(VM memory reservations + overhead) < 75% of the cluster memory capacity available for VMs
  34. Cluster Capacity
     • For maximum performance of all VMs in the cluster: Sum(VM demands) < 80% of the cluster capacity (see the checking sketch below)
     • DRS starts throttling less important VMs as demand gets close to or exceeds capacity
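The two rules of thumb on these capacity slides, as a small checking sketch. The 75%/80% thresholds come from the slides; the input shape is an assumption:

```python
def cluster_capacity_ok(vm_cpu_res_mhz, vm_mem_res_mb,
                        vm_cpu_demand_mhz, vm_mem_demand_mb,
                        cluster_cpu_mhz, cluster_mem_mb):
    """Check the 75% reservation and 80% demand rules of thumb against
    the cluster capacity available for VMs. Memory reservations should
    already include per-VM overhead."""
    return {
        "cpu_reservations<75%": sum(vm_cpu_res_mhz) < 0.75 * cluster_cpu_mhz,
        "mem_reservations<75%": sum(vm_mem_res_mb) < 0.75 * cluster_mem_mb,
        "cpu_demand<80%": sum(vm_cpu_demand_mhz) < 0.80 * cluster_cpu_mhz,
        "mem_demand<80%": sum(vm_mem_demand_mb) < 0.80 * cluster_mem_mb,
    }

# Illustrative numbers for a 60 GHz / 384 GB cluster.
print(cluster_capacity_ok([4000, 8000], [8192, 16384],
                          [9000, 12000], [12288, 20480],
                          cluster_cpu_mhz=60000, cluster_mem_mb=393216))
```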
  35. Capacity Management Scenarios
     • vMotions becoming slower in the cluster
       • Why? The unreserved CPU on the host is less than 30% of a core. vMotion tries to reserve 30% of a core for the vMotion process; if that fails, the vMotion may proceed at a slower rate
     • Many VM power-on failures in a DRS cluster
       • Why? Not enough unreserved memory in the cluster/sub-cluster to satisfy the powering-on VM's reservation or overhead
  36. Questions
     Questions, Comments, Additional Scenarios?
     DRS Survey: http://www.vmware.com/go/drssurvey
  37. Related material
     • Other DRS talks at VMworld 2013:
       • VSVC5280 - DRS: New Features, Best Practices and Future Directions (11 am Monday & 11 am Tuesday)
       • STO5636 - Storage DRS: Deep Dive and Best Practices to Suit Your Storage Environments (4 pm Monday & 12:30 pm Tuesday)
       • VSVC5364 - Storage IO Control: Concepts, Configuration and Best Practices to Tame Different Storage Architectures (8:30 am Wednesday & 11 am Thursday)
     • From VMworld 2012:
       • VSP2825 - DRS: Advanced Concepts, Best Practices and Future Directions
     • VMware Technical Journal publications:
       • VMware Distributed Resource Management: Design, Implementation, and Lessons Learned
       • Storage DRS: Automated Management of Storage Devices in a Virtualized Datacenter
     More related publications at http://labs.vmware.com/academic/publications
  38. THANK YOU
  39. Performance and Capacity Management of DRS Clusters. Anne Holler, VMware; Ganesha Shanmuganathan, VMware (VSVC5821 #VSVC5821)
  40. Backup Slides
  41. Cluster Creation: Hardware
     • Heterogeneous components vs. homogeneous components
     • Have vMotion-compatible hosts (EVC helps)
     • DRS supports heterogeneous hosts, storage, and connectivity
     [Diagram: compute, network, and storage components connected by switches and a router]
  42. Cluster Creation: Connectivity
     • Avoid storage islands
     • The network should span the entire cluster
     • Handle network/storage bottlenecks and slow links using VM-host affinity rules
     [Diagram: hosts, storage, switches, and a router]
  43. VM Creation
     • Important VM parameters:
       • Number of vCPUs
       • Memory size
       • CPU/memory reservation
       • Shares
       • RP hierarchy
       • VM rules
  44. VM Sizing
     • Too many vCPUs wastes resources on overhead
     • Too few vCPUs may cause the application to perform poorly (check whether all vCPUs are near 100% used)
     • Too much memory can cause excessive ballooning
     • Too little memory may cause guest-internal swapping (check swap statistics inside the guest)
     • VM needs may vary with time; size for the maximum of what the VM needs
  45. Case Study: Stretched Cluster
     • Use soft VM/host rules to partition VMs between sites
     • New: added optional support for semi-hard VM/host rules (drop soft VM/host rules only for constraints, not for high utilization)
     [Diagram: two sites connected over a WAN]
  46. Cluster Capacity
     • Sum(VM reservations + overhead) < 75% of the cluster capacity available for VMs
  53. Cluster Capacity
     • Sum(VM demands) should be < 80% of the cluster capacity available for VMs
  54. Demand [chart-only slide]
  55. Cluster Capacity
     • Number of hosts: as large as possible
     • HA: enable admission control
     • Advanced knobs can make DRS consider the number of VMs per host
  56. Performance Case Studies
  57. VMs in a Resource Pool Not Getting the Same Performance
     [Diagram: cluster RP tree, labeled with reservation (MHz/MB) and limit (MHz/MB): Root contains RP1 (400, 8000) with VM1 and VM2, and RP2 (200, 400) with VM3; shares shown per node]
  58. Resource Pools: Performance
     • Cluster-wide resource pools
     [Diagram: the cluster-wide RP tree split across hosts: RP1 on Host A (150, 3000) holding VM1, and RP1 on Host B (250, 5000) holding VM2]
  59. Resource Pools: Performance
     [Diagram: resulting per-host trees: Host A runs RP1 (150, 3000) with VM1 and RP2 (200, 400) with VM3; Host B runs RP1 (250, 5000) with VM2]
  60. Resource Pools: Performance
     • DRS flows resources between hosts every 5 minutes
     • Tuned to minimize the number of migrations by throttling this flow
     • More aggressive settings to flow the resources (advanced control knobs):
       • CapRpReservationAtDemand = False (default True): flows reservations more aggressively
       • AllowUnlimitedCpuLimitForVms = False (default True): flows limits more aggressively
  61. CPU Management
     • Reference: http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-CPU-Sched-Perf.pdf
  62. CPU Management
     • Demand: how CPU demand (aka CPU active) is estimated
       • By ESX: the CPU time the VM would consume if there were no stolen time
         • CPU demand = used + stolen * run / (run + sleep), where stolen time includes:
         • Ready: vCPU is runnable but the target CPU is busy
         • Overlap: use of CPU to handle interrupts during this vCPU's execution
         • Hyperthreading: impact on CPU operation due to use of the partner CPU
         • Power management: loss of CPU cycles due to platform frequency scaling
       • By DRS load balancing: average ESX CPU demand over the last 5 minutes
       • DRS cost/benefit & DPM power-off consider roughly the maximum over longer periods
  63. CPU Management, cont'd
     • Reservation: impact
       • Ensures via admission control that the VM can obtain its reserved CPU when demanded
       • Work-conserving; other VMs use the reserved CPU when the VM doesn't demand it
     • Ready time metric
       • General rule of thumb: it should be 5% or less per vCPU
       • Values higher than this do not necessarily indicate problems
       • Check out the discussion and chime in at:
         http://www.yellow-bricks.com/2013/05/09/drs-not-taking-cpu-ready-time-in-to-account-need-your-help/
     • Some troubleshooting case studies follow
  64. CPU Management Case Studies
     • Case: CPU ready time metric high, but host not heavily utilized
       • Explanation: NUMA scheduling favors running on a CPU near local memory
       • Fix: none; this scheduling gives better performance
       • http://blogs.vmware.com/vsphere/2012/02/vspherenuma-loadbalancing.html
  65. CPU Management Case Studies
     • Case: CPU ready time metric high, but host lightly utilized
       • Explanation: platform-level power management enabled
       • Fix: set the BIOS power management setting to Maximum or OS Control
  66. CPU Management Case Studies
     • Case: CPU ready time metric high; better performance after a manual move
       • Explanation: the DRS average underestimated the demand of spiky CPU workloads
       • Fix: introduced the AggressiveCPUActive advanced option, which uses the larger of:
         • the 5-minute average of ESX CPU demand
         • the 80th percentile (2nd largest) of the last five 1-minute averages of ESX CPU demand
  67. Memory Management
     • Reference: http://www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdf
  68. Memory Management
     • Demand: how memory demand (aka memory active) is estimated
       • By ESX
         • Statistical: unmap a small random sample of pages each minute, see what percentage are referenced, and assume that percentage of mapped pages is active; take the max of the last 4 minutes
       • By DRS
         • For load balancing: average ESX memory demand over the last 5 minutes + a percentage (default 25) of idle consumed memory
         • DRS cost/benefit and DPM power-off consider roughly the maximum over longer periods
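A toy simulation of the statistical sampling idea above: pick a small random sample of pages, count how many are touched within the minute, and scale that fraction up to the VM's memory size. The sample size, page granularity, and workload model are assumptions:

```python
import random

def estimate_active_mb(mem_size_mb, touched_this_minute, sample_size=100):
    """Estimate active memory by sampling: pick random pages, check what
    fraction were touched in the last minute, scale up to memSize."""
    pages = range(mem_size_mb)  # model 1 MB "pages" for simplicity
    sample = random.sample(pages, sample_size)
    active_fraction = sum(p in touched_this_minute for p in sample) / sample_size
    return active_fraction * mem_size_mb

# VM with 4096 MB configured, of which ~1024 MB is actually touched per minute.
touched = set(random.sample(range(4096), 1024))
print(estimate_active_mb(4096, touched))  # ~1024, varies with the sample
```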
  69. Memory Management, cont'd
     • Reservation: impact
       • Ensures via admission control that the VM can obtain its reserved amount of memory
       • Not work-conserving; once reserved memory is consumed, it is not reallocated
     • Reclamation
       • Ballooning, transparent page sharing, compression, ESX swapping
     • Ballooning/swapping metrics
       • Customers often monitor for non-zero values
       • Avoiding over-commitment entirely can lead to high memory cost
  70. Memory Management Case Studies
     • Case: undesirable VM migration for memory imbalance
       • Explanation: DRS was managing active memory; the customer wanted DRS to manage consumed memory
       • Fix: use the IdleTax option to include more idle memory in active
       • In vSphere 5.5, added the new option PercentIdleMBInMemDemand (default 25%), which can be set to 100% to manage to consumed memory
  71. Memory Management Case Studies
     • Case: DPM over-consolidated memory
       • Explanation: DPM was consolidating on active memory; the customer wanted DPM to use consumed memory
       • Fix: added the new option PercentIdleMBInMemDemand (can also be used instead of IdleTax)
  72. Big Picture
     [Diagram: vCloud Director on top of vCenter (DRS, SDRS, DPM), managing ESX hosts]
