24 Hours of PASS, Summit Preview Session: Virtual SQL Server CPUs

One of the largest points of contention between SQL Server DBAs and VM administrators is how to configure virtual machine CPUs. Experience says more CPUs are better for performance. VM admins say fewer are better. Third-party vendors say you need all of them (no matter how many your hosts have). Can over-provisioning virtual machine CPUs speed things up, or does it slow things down? What is the right methodology to determine the correct number of virtual CPUs? How does this configuration align with the physical servers? From sampling and analyzing performance data, to “right-sizing” your SQL Server virtual machine CPU count, to properly aligning the VM with the physical server NUMA topology, this session explains how to manage and validate your virtual SQL Server vCPU configuration. It also shares tips and tricks that you can take back and apply immediately to your own virtual SQL Servers.

Published in: Software


  1. 1. Virtual CPUs: Right to Ludicrous Speed David Klee, Founder & Chief Architect, Heraflux Technologies
  2. 2. Technical Assistance • If you require assistance during the session, type your inquiry into the question pane on the right side. • Maximize your screen with the zoom button on the top of the presentation window. • Please fill in the short evaluation following the session. It will appear in your web browser.
  3. 3. Thank You to Our Sponsors Quest helps our customers reduce tedious administration tasks so they can focus on the innovation necessary for their businesses to grow. Quest® solutions are scalable, affordable and simple-to-use, and they deliver unmatched efficiency and productivity. Combined with Quest’s invitation to the global community to be a part of its innovation, as well as our firm commitment to ensuring customer satisfaction, Quest will continue to accelerate the delivery of the most comprehensive solutions for Azure cloud management, SaaS, security, workforce mobility and data-driven insight. Melissa Global Intelligence provides data quality and identity resolution tools for SQL Server and .NET to perform the tasks of ensuring new incoming data is in good condition and maintaining data quality over time. Utilizing comprehensive reference datasets, Melissa solutions verify, standardize, dedupe, enrich, geocode and update global contact data including address, name, email and phone data. Since 1985, Melissa has helped businesses of any size improve data management, data governance and business analytics with clean, reliable and actionable data. Melissa is a Registered Microsoft Partner with international offices in the U.K., Germany and India. Nutanix makes IT infrastructure invisible with an enterprise cloud platform that delivers the agility and economics of the public cloud, without sacrificing the security and control of on-premises infrastructure. Whether upgrading existing infrastructure or deploying new environments, Nutanix is the ideal solution for virtualized SQL Server deployments.
• Consolidate SQL Server databases and VMs onto a single converged platform • Run Microsoft SQL Server with other critical workloads, without sacrificing performance or reliability • Remove the complexity and reduce the costs of traditional storage • Eliminate planned downtime and protect against unplanned issues to deliver continuous availability • Keep pace with rapidly growing business needs
  4. 4. Make the Most Out of Your PASS Membership PASS is a not-for-profit organization which offers year-round learning opportunities to data professionals. • Access to online training and content • Enjoy discounted event rates • Join Local Groups and Virtual Groups • Get advance notice of member exclusives Check out your member benefits today: www.pass.org
  5. 5. Where Data Professionals Connect, Share, and Learn REGISTER NOW www.PASSsummit.com OCT 31 – NOV 3 SEATTLE WA
  6. 6. Virtual CPUs: Right to Ludicrous Speed David Klee, Founder & Chief Architect, Heraflux Technologies
  7. 7. HEALTH, CAPACITY, & EFFICIENCY Focused on understanding system health, capacity and operations management, and overall efficiency of all things IT. David Klee FOUNDER – HERAFLUX TECHNOLOGIES Enterprise consulting centered on the convergence of infrastructure, data, and cloud DATA INFRASTRUCTURE ARCHITECT Seventeen years of enterprise SQL Server virtualization experience. Virtualized some of the largest SQL Servers in the world. /davidaklee /kleegeek /in/davidaklee
  8. 8. DBA Knowledge Gaps VIRTUALIZATION & HARDWARE • What is it? • How do they work together? MODERN CPU ARCHITECTURE • Cores & sockets • NUMA & memory locality HYPERVISOR RESOURCE SCHEDULING • Hypervisor queues, resource overcommit, queue balancing • “Right-sizing” • SQL Server balancing
  9. 9. Virtualization Basics RESOURCES • Compute resources in datacenter: CPU, Memory, Network, Storage QUEUES • Every resource request placed in queue • Queue time variable • Queues not FIFO • Imbalances & overcommitment
  10. 10. Four Main Food Groups • CPU: Our primary balancing act • Memory: Mostly non-oversubscribed, so less important • Storage: Flash storage shifts bottleneck back up the stack • Networking: Verify throughput, but usually not a bottleneck to normal operations
  11. 11. Hypervisor Resource Queues [Diagram: five VM tasks feed four hypervisor resource paths: CPU (scheduling queue, CPU scheduler, CPU execution), memory (allocation queue, memory allocator, memory read/write), disk (scheduling queue, disk scheduler, disk read/write), and network (scheduling queue, network scheduler, network transmit/receive)]
  12. 12. Physical CPU Architecture
  13. 13. CPU “Package” [Diagram: four cores, each with its own L1 and L2 cache, alongside the uncore, a shared last level cache, and a memory controller]
  14. 14. CPU Package Connectors (Img src: https://en.wikipedia.org/wiki/Xeon)
  15. 15. CPU Sockets (Img src: http://bit.ly/2tJU98k)
  16. 16. CPU UMA Architecture [Diagram: CPU 0 through CPU 7 share a single memory controller (northbridge) and an I/O controller, with all RAM DIMMs attached to that one memory controller]
  17. 17. NUMA Nodes (Img src: http://bit.ly/2tJU98k)
  18. 18. CPU NUMA Architecture [Diagram: CPU Package 0 and CPU Package 1, each with its own memory controller and its own bank of RAM DIMMs]
  19. 19. Four Socket NUMA [Diagram: four CPU packages (labeled 0, 1, 3, and 4 on the slide), each with its own memory controller and its own bank of RAM DIMMs]
  20. 20. Locality (Img src: http://bit.ly/2tJU98k)
  21. 21. Why Does This Matter? SQL Server is NUMA Aware • All layers must be properly aligned to maintain performance • Mis-alignment causes substantial performance impact Hypervisor Can Obfuscate pNUMA • Can create an immediate out-of-balance situation • Degrades performance silently Maximum Performance is Critical • SQL Server is extremely latency sensitive with these layers
  22. 22. Virtual Machine CPUs
  23. 23. VM Admin Perspectives
  24. 24. What You Can See
  25. 25. Determine vCPU Count How Many Do You Need? • “Right-sizing” analysis • Ongoing performance baseline • Size for now, not future • Want target CPU utilization 40-60% during routine business operations • Leave headroom for short-term growth • Resize VM as necessary
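The right-sizing bullets above can be sketched as a small calculation. This is a minimal sketch, not the presenter's method: it assumes CPU demand scales linearly across vCPUs, and the function name `suggest_vcpu_count` and its parameters are hypothetical. The default target of 50% is simply the midpoint of the 40-60% range on the slide.

```python
import math


def suggest_vcpu_count(current_vcpus, avg_util_pct, target_util_pct=50.0):
    """Estimate a vCPU count that would bring average utilization near the target.

    Assumption: total CPU demand stays the same and spreads linearly
    across however many vCPUs the VM is given.
    """
    if current_vcpus < 1:
        raise ValueError("current_vcpus must be at least 1")
    if not 0 <= avg_util_pct <= 100:
        raise ValueError("avg_util_pct must be between 0 and 100")
    if not 0 < target_util_pct <= 100:
        raise ValueError("target_util_pct must be in (0, 100]")
    # Demand expressed as "vCPUs needed at the target utilization".
    needed = current_vcpus * avg_util_pct / target_util_pct
    return max(1, math.ceil(needed))


if __name__ == "__main__":
    # A 16-vCPU VM averaging 15% during business hours.
    print(suggest_vcpu_count(16, 15))
```

The result is only a starting point for a resize; the baseline should be re-collected afterwards, in line with "Resize VM as necessary".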
  26. 26. Create Consumption Baseline Performance Metric Collection • Third-party utilities • Windows Perfmon • 30-second granularity • hfxte.ch/perfmon – free setup guide • hfxte.ch/perfmonposh – free PoSH to import BLGs into database
  27. 27. Perfmon Counters [Chart: SQL Server CPU by Core - Five Minute Median (8 Core); x-axis: Time of Day (Avg), 00:00 to 23:40; y-axis: % CPU Consumption, 0 to 60; series: CPU00 through CPU07]
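The chart title mentions a five-minute median, and the baseline slide suggests 30-second Perfmon samples. One way to get from one to the other is sketched below. The function name and parameters are hypothetical, and the samples are assumed to be evenly spaced and already loaded into a list for a single core.

```python
from statistics import median


def five_minute_medians(samples, sample_seconds=30, window_seconds=300):
    """Collapse evenly spaced CPU samples into one median per window.

    With the defaults, each window holds ten 30-second samples.
    A trailing partial window is still reported.
    """
    if sample_seconds <= 0 or window_seconds < sample_seconds:
        raise ValueError("window must be at least one sample long")
    per_window = window_seconds // sample_seconds
    return [
        median(samples[i:i + per_window])
        for i in range(0, len(samples), per_window)
    ]


if __name__ == "__main__":
    cpu00 = [12, 14, 90, 13, 15, 11, 12, 16, 14, 13]  # one five-minute window
    print(five_minute_medians(cpu00))
```

A median is less sensitive than a mean to a single short spike, such as the 90 in the sample data.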
  28. 28. Placement [Diagram: host with two sockets, each 1x12 CPU / 128GB; VM 1x10 / 64GB with its vCPUs placed on one socket; VM 2x8 / 128GB with its vCPUs split across both sockets]
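The placement slide shows a 10-vCPU VM on one socket and a 16-vCPU VM split 2x8 across two. A rough sketch of that reasoning follows. It assumes each socket is one NUMA node and ignores hypervisor memory overhead; `NumaNode` and `balanced_layout` are hypothetical names for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    cores: int
    memory_gb: int


def balanced_layout(vcpus, vm_memory_gb, node):
    """Return (virtual sockets, cores per socket) for a VM on identical NUMA nodes.

    The VM spans as many nodes as either its vCPU count or its memory requires,
    and the vCPUs must divide evenly so every node carries the same share.
    """
    nodes_for_cpu = math.ceil(vcpus / node.cores)
    nodes_for_memory = math.ceil(vm_memory_gb / node.memory_gb)
    nodes = max(nodes_for_cpu, nodes_for_memory)
    if vcpus % nodes != 0:
        raise ValueError("vCPU count does not divide evenly across NUMA nodes")
    return nodes, vcpus // nodes


if __name__ == "__main__":
    socket = NumaNode(cores=12, memory_gb=128)
    print(balanced_layout(10, 64, socket))   # fits inside one node
    print(balanced_layout(16, 128, socket))  # must span two nodes
```

The layout the guest actually sees should still be checked, for example with the CoreInfo tool listed on the next slide.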
  29. 29. Verify vCPU Presentation MS CoreInfo http://bit.ly/1SKNcWL
  30. 30. CPU Scheduling Pressure
  31. 31. vCPU Overcommit [Diagram: two VM hosts (2x12 each) on shared storage, running six VMs (2x8 each), each VM hosting many databases]
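Reading the overcommit slide as two hosts of 2x12 cores running six VMs of 2x8 vCPUs, the overcommit ratio can be worked out as below. This is an illustrative sketch: the function names are hypothetical, shapes are given as (sockets, cores per socket), and hyperthreading is ignored.

```python
def total_cores(shapes):
    """Sum sockets * cores-per-socket over a list of (sockets, cores) shapes."""
    return sum(sockets * cores for sockets, cores in shapes)


def overcommit_ratio(vm_shapes, host_shapes):
    """Allocated vCPUs divided by physical cores across the cluster."""
    physical = total_cores(host_shapes)
    if physical == 0:
        raise ValueError("at least one physical core is required")
    return total_cores(vm_shapes) / physical


if __name__ == "__main__":
    hosts = [(2, 12)] * 2  # two hosts, 2 sockets x 12 cores each
    vms = [(2, 8)] * 6     # six VMs, 2 virtual sockets x 8 cores each
    print(f"{overcommit_ratio(vms, hosts):.1f} vCPUs per physical core")
```

With these figures, 96 vCPUs compete for 48 physical cores.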
  32. 32. CPU Scheduling Queueing [Diagram: VMs of shape (1x8), (2x8), (2x6), (4x3), and (16x1) submit vCPU tasks to eight READY TO RUN queues on a VM host (2x8); vCPU tasks then execute on the pCPUs]
  33. 33. Scheduling Trouble Measurement Hyper-V – Wait Time Per Dispatch • Measured in nanoseconds • Average value or individual core values • Sample interval (one second) • Avg. over collection interval (X seconds) • (Metric value / sample interval total nanoseconds) * 100% = Avg. percent perf loss
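The Hyper-V formula on this slide can be written out as a short function. This is a sketch of the slide's arithmetic only; the function and parameter names are hypothetical, and the one-second default comes from the sample interval named on the slide.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def hyperv_wait_loss_pct(metric_value_ns, sample_interval_seconds=1.0):
    """(Metric value / sample interval total nanoseconds) * 100, per the slide."""
    if sample_interval_seconds <= 0:
        raise ValueError("sample_interval_seconds must be positive")
    interval_ns = sample_interval_seconds * NANOSECONDS_PER_SECOND
    return metric_value_ns * 100.0 / interval_ns


if __name__ == "__main__":
    # 50 ms of wait, in nanoseconds, against a one-second sample interval.
    print(f"{hyperv_wait_loss_pct(50_000_000):.1f}% average perf loss")
```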
  34. 34. Scheduling Trouble Measurement VMware – CPU Ready Time • Measured in milliseconds • Sum total value or individual core values • Fixed 20-second sample interval • (Sum total / # cores / 20000ms) * 100% = Avg. percent perf loss
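The CPU Ready conversion on this slide works out as follows. The function and parameter names are hypothetical; the 20,000 ms default is the fixed 20-second sample interval stated on the slide.

```python
def cpu_ready_pct(ready_sum_ms, vcpus, interval_ms=20_000):
    """(Sum total / # cores / 20000 ms) * 100, per the slide."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return ready_sum_ms * 100.0 / vcpus / interval_ms


if __name__ == "__main__":
    # An 8-vCPU VM reporting 8,000 ms of summed ready time in one sample.
    print(f"{cpu_ready_pct(8_000, 8):.1f}% average perf loss")
```

Dividing by the vCPU count matters because the summed value grows with the number of vCPUs, so an unnormalized figure makes large VMs look worse than they are.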
  35. 35. SMP vCPU Schedule Balancing [Diagram: VM (1x8) with MaxDOP=4 submits vCPU tasks to eight READY TO RUN queues on a VM host (2x8); vCPU tasks then execute on the pCPUs]
  36. 36. Scheduling Trouble Measurement VMware – Co-Stop • Measured in milliseconds • Sum total value or individual core values • Fixed 20-second sample interval • Look for sustained stretches • No known equivalent on MS Hyper-V
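"Look for sustained stretches" can be approximated by scanning the 20-second Co-Stop samples for consecutive runs above a threshold. The slide gives no threshold or minimum duration, so both are placeholders supplied by the caller, and the function name is hypothetical.

```python
def sustained_stretches(samples_ms, threshold_ms, min_samples=3):
    """Return (start_index, length) for each run of consecutive samples above
    threshold_ms that lasts at least min_samples samples."""
    runs = []
    start = None
    for index, value in enumerate(samples_ms):
        if value > threshold_ms:
            if start is None:
                start = index  # a new run begins here
        else:
            if start is not None and index - start >= min_samples:
                runs.append((start, index - start))
            start = None
    # Close out a run that is still open at the end of the data.
    if start is not None and len(samples_ms) - start >= min_samples:
        runs.append((start, len(samples_ms) - start))
    return runs


if __name__ == "__main__":
    co_stop = [0, 500, 600, 700, 0, 900, 0, 300, 400, 500, 600]
    # The threshold of 200 ms is an arbitrary illustration, not a recommendation.
    print(sustained_stretches(co_stop, threshold_ms=200))
```

Isolated spikes, such as the single 900 ms sample, are ignored; only runs of the minimum length are reported.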
  37. 37. SQL Server
  38. 38. Balanced Harmony [Diagram: host with two sockets, each 1x12 CPU / 128GB; VM 2x8 / 128GB with its vCPUs split across both sockets; MaxDOP = 8; a database running a big query]
  39. 39. Remediation Tasks RIGHT-SIZE ALL THE VMs REDUCE VM WORKLOAD • Reduce vCPU allocations (when applicable) • Align vNUMA boundaries • Reduce vCPU queue scheduling • Smaller footprint easier to schedule • Fewer host CPU scheduling delays • Load balance VM cluster • Remove VM workloads from your host • Resource pools to prioritize workloads
  40. 40. QUESTIONS?
  41. 41. Coming up next! Azure SQL VM - Implementing Basic AG in SQL 2016 STD Kenneth Urena
  42. 42. THANK YOU FOR ATTENDING Follow @sqlpass Share your thoughts with #PASS24HOP & #sqlpass
