vSphere Performance
Best Practices
Peter Boone, VMware, Inc.
INF-VSP1800
#vmworldinf
2
Disclaimer
 This session may contain product features that are
currently under development.
 This session/overview of the new technology represents
no commitment from VMware to deliver these features in
any generally available product.
 Features are subject to change, and must not be included in
contracts, purchase orders, or sales agreements of any kind.
 Technical feasibility and market demand will affect final delivery.
 Pricing and packaging for any new technologies or features
discussed or presented have not been determined.
3
Global Support Services and Customer Advocacy
Bangalore, India
Tokyo, Japan
Cork, Ireland
Burlington, Canada
Palo Alto, CA
Broomfield, CO
Support offices
Local language support
Spanish, Portuguese, French, German, Japanese, Chinese
Global Coverage
24x7, 365 days/year
6 Support Centers
1000+ Support Engineers
Follow-the-sun support for Severity 1 issues
Support relationships with 100% of the Fortune 100; 99% of the Fortune 500
4
Customer Support Day Events
Coming to a location near you: sharing of VMware best practices!
 Support Days are a collaboration between VMware Support, Sales
and customers – you learn directly from the experts
 Topics are driven by
customer input, and
typically include:
• Best practices
• Tips/tricks
• Top issues
• Product roadmaps/demos
• Certification offerings
http://www.vmware.com/go/supportdays
5
Overview
 What a performance problem sounds like:
• “My VM is running slow and I don’t know what to do!”
• “I tried adding more memory and CPUs but the problem got worse!”
• “My VM is slow on one host but fast on another!”
 What to look for? Where to start?
 We will explore some of the most common performance-related
issues that our support centers receive cases for
6
A word about performance…
 Troubleshooting methodology must define:
• How to find root cause
• How to fix the problem
 Must answer these questions:
1. How do we know when we are done?
2. Where do we start looking for problems?
3. How do we know what to look for to identify a problem?
4. How do we find the root-cause of a problem we have identified?
5. What do we change to fix the root-cause?
6. Where do we look next if no problem is found?
7
Agenda
 Benchmarking & Tools
 Best Practices and Troubleshooting
 The 4 “food groups”
• Memory
• CPU
• Storage
• Network
© 2012 VMware Inc. All rights reserved
BENCHMARKING & TOOLS
9
Benchmarking
 Consistent and reproducible results
 Important to have base level of acceptable performance
• Expectation vs. Acceptable
 Determine baseline of performance prior to deployment
• Benchmark on a physical system if applicable
 Avoid subjective metrics, stay quantitative
• “The system seems slower”
• “This worked better last year”
10
Benchmarking
 Benchmarking should be done at the application layer
• Use application-specific benchmarking tools and load generators
• Check with the application vendor
 Isolate variables, benchmark optimum situation before
introducing load
 Understand dependencies
• Human interaction
• Other “food groups”
• Compare apples-to-apples
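One way to keep a benchmark comparison quantitative, as the two slides above ask, is to record response-time samples for a baseline run and a current run and compare summary statistics. The sketch below is only an illustration: the sample values, the helper names, and the 10% tolerance are assumptions and do not come from the session.

```python
import statistics

def summarize(samples_ms):
    """Return the median and nearest-rank 95th percentile of latency samples (ms)."""
    ordered = sorted(samples_ms)
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    rank = max(1, -(-95 * len(ordered) // 100))  # ceiling division
    return {"median": statistics.median(ordered), "p95": ordered[rank - 1]}

def regression_pct(baseline_ms, current_ms, key="median"):
    """Percentage change of the chosen statistic relative to the baseline."""
    base = summarize(baseline_ms)[key]
    cur = summarize(current_ms)[key]
    return (cur - base) / base * 100.0

# Hypothetical measurements taken with the same load generator on both runs.
baseline = [10, 11, 10, 12, 11, 10, 13, 11, 10, 12]
current = [12, 13, 12, 14, 13, 12, 15, 13, 12, 14]
TOLERANCE_PCT = 10.0  # assumed acceptance threshold, agree on one before testing

change = regression_pct(baseline, current)
print(f"median changed by {change:.1f}%")
print("regression" if change > TOLERANCE_PCT else "within tolerance")
```

A number such as "the median rose by 18%" replaces "the system seems slower" and can be re-measured after every change.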
11
Tools – vCenter Operations
 Aggregates thousands of metrics into Workload, Capacity, and Health scores
 Self-learns “normal” conditions using patented analytics
 Smart alerts of impending performance and capacity degradation
 Identifies potential performance problems before they start
12
Tools – vCenter Operations
13
Tools – esxtop
 Valuable tool built in to vSphere hosts
 View or capture real-time data
• View or playback data later
• Import data in 3rd party tools
 vSphere Client performance graphs get their data from esxtop data
• Presentation/unit may be different (e.g. %RDY)
 Little overhead impact on the host
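Captured esxtop data can be post-processed with a few lines of script. esxtop has a batch mode (`-b`, with `-d` for the delay and `-n` for the number of samples) that writes CSV with one column per counter. The sketch below averages every column whose header ends with a given counter name. The sample capture, host name, and VM names are made up for illustration, and real header names may differ between versions.

```python
import csv
import io

# Hypothetical two-sample capture; a real file might come from
# `esxtop -b -d 5 -n 2 > capture.csv` run on the host.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\host1\Group Cpu(1001:web01)\% Ready","\\host1\Group Cpu(1002:db01)\% Ready"',
    '"09/01/2012 10:00:00","2.5","11.0"',
    '"09/01/2012 10:00:05","3.5","13.0"',
])

def average_counter(csv_text, counter_suffix):
    """Average every column whose header ends with counter_suffix."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    result = {}
    for index, name in enumerate(header):
        if name.endswith(counter_suffix):
            values = [float(row[index]) for row in data]
            result[name] = sum(values) / len(values)
    return result

for column, avg in average_counter(SAMPLE, "% Ready").items():
    print(f"{column}: {avg:.1f}")
```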
MEMORY
15
Memory – Allocation
 A VM’s RAM is not necessarily physical RAM
• vRAM + overhead = maximum physical RAM
 Whether or not that memory is physical or virtual depends on…
• Host configuration
• Shares
• Limits
• Reservations
• Host load
• Idle/Active VMs
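The relationship "vRAM + overhead = maximum physical RAM" can be written out as arithmetic. The sketch below is a simplified model, not VMware's accounting: it assumes a memory limit caps the guest portion only, and the 100 MB overhead is a placeholder (real overhead values are listed in the Resource Management Guide).

```python
def max_physical_mb(vram_mb, overhead_mb, limit_mb=None):
    """Upper bound on host physical memory one VM can consume.

    Simplified model: guest memory is capped by the configured vRAM or,
    if set, by the limit; virtualization overhead is added on top.
    """
    guest_ceiling = vram_mb if limit_mb is None else min(vram_mb, limit_mb)
    return guest_ceiling + overhead_mb

OVERHEAD_MB = 100  # placeholder, not a value from the guide

print(max_physical_mb(4096, OVERHEAD_MB))                 # no limit
print(max_physical_mb(4096, OVERHEAD_MB, limit_mb=2048))  # limit below vRAM
```

In the second call the guest still sees 4 GB, but at most 2 GB of it can be backed by physical memory, which is why an unnoticed limit shows up as ballooning or swapping.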
16
Memory – Overhead
Source: vSphere 5.0 Resource Management Guide
17
Memory – Host Memory Management
Occurs when memory is under contention
 Transparent Page Sharing
 Ballooning
 Compression
 Swapping
18
Memory – Transparent Page Sharing
19
Memory – Ballooning
20
Memory – Compression
21
Memory – Swapping
22
Memory – Swapping
23
Memory – VM Resource Allocation
24
Memory – Resource Pool Allocation
25
Memory – Ballooning vs. Swapping
 Ballooning is better than swapping
 Guest can surrender unused/free pages
 Guest chooses what to swap, can avoid swapping “hot” pages
 Idle memory tax uses ballooning
26
Memory – Rightsizing
 Generally, it is better to OVER-commit than UNDER-commit
 If the running VMs are consuming too much host/pool memory…
• Some VMs may not get physical memory
• Ballooning or host swapping
• Higher disk IO
• All VMs slow down
27
Memory – Rightsizing
 If a VM has too little vRAM…
• Applications suffer from lack of RAM
• The guest OS swaps
• Increased disk traffic, thrashing
• SAN slowdown as a result of increased disk traffic
 If a VM has too much vRAM…
• Higher overhead memory
• Possible decreased failover capacity
• Longer vMotion time
• Larger VSWP file
• Wasted resources
28
Memory – Troubleshooting
 Wrong resource allocation
 A limit may go unnoticed, e.g. when a VM or template with a limit gets cloned
 Custom share values
 Ballooning or swapping at the host level
• Ballooning is a warning sign, not a problem
• Swapping is a performance issue if seen over an extended period
 Swapping/paging at the guest level
• Under-provisioned guest memory
 Missing balloon driver (Tools)
29
Memory – Best Practices
 Avoid high active host memory over-commitment
• No host swapping occurs when total memory demand is less than the physical
memory (Assuming no limits)
 Right-size guest memory
• Avoid guest OS swapping
 Ensure there is enough vRAM to cover demand peaks
 Use a fully automated DRS cluster
• Test that vMotion works
• Use Resource Pools with High/Normal/Low shares
• Avoid using custom shares
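The first best practice above (no host swapping while total memory demand stays below physical memory, assuming no limits) can be checked with simple arithmetic. This is a rough sketch: the VM names and sizes are hypothetical, and per-VM overhead is ignored for brevity.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    vram_mb: int    # configured memory
    active_mb: int  # memory the guest is actively using (demand)

def overcommit_ratio(vms, host_mb):
    """Configured vRAM divided by host physical memory."""
    return sum(vm.vram_mb for vm in vms) / host_mb

def demand_fits(vms, host_mb):
    """True when total active demand is below host physical memory."""
    return sum(vm.active_mb for vm in vms) < host_mb

HOST_MB = 16384
vms = [
    VM("web01", 8192, 3000),
    VM("app01", 8192, 4000),
    VM("db01", 8192, 5000),
]

print(f"overcommit ratio: {overcommit_ratio(vms, HOST_MB):.2f}")
print("demand fits in physical memory:", demand_fits(vms, HOST_MB))
```

Here the host is over-committed on configured memory (ratio 1.5) while active demand (12000 MB) still fits, which is the situation the slide describes as safe.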
CPU
31
CPU – Overview
 Raw processing power of a given host or VM
• Hosts provide CPU resources
• VMs and Resource Pools consume CPU resources
 CPU cores/threads need to be shared between VMs
 Fair scheduling of vCPU time
• Hardware interrupts for a VM
• Parallel processing for SMP VMs
• I/O
32
CPU – esxtop
33
CPU – esxtop
 Interpret the esxtop columns correctly
 %USED – Physical CPU usage
 %SYS – Percentage of time in the VMkernel
 %RUN – Percentage of total scheduled time to run
 %WAIT – Percentage of time in blocked or busy wait states
 %IDLE – Percentage of time spent idle; %WAIT includes idle time, so %WAIT − %IDLE can be used to estimate I/O wait time
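The I/O wait estimate from the last bullet is a plain subtraction. A minimal sketch, with made-up sample values:

```python
def estimate_io_wait(wait_pct, idle_pct):
    """Estimate blocked (non-idle) wait time from esxtop %WAIT and %IDLE.

    %WAIT includes idle time, so what remains after subtracting %IDLE is
    time spent blocked on something else, typically I/O.
    """
    return max(0.0, wait_pct - idle_pct)

# Hypothetical readings for one VM
print(estimate_io_wait(wait_pct=85.0, idle_pct=70.0))
```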
34
CPU – Performance Overhead & Utilization
 Different workloads have different overhead costs (%SYS) even for
the same utilization (%USED)
 CPU virtualization adds varying amounts of system overhead
• Direct execution vs. privileged execution
• Paravirtual adapters vs. emulated adapters
• Virtual hardware (Interrupts!)
• Network and storage I/O
35
CPU – vSMP
 Relaxed Co-Scheduling: vCPUs can run out-of-sync
 Idle vCPUs incur a scheduling penalty
• Extra vCPUs impose unnecessary scheduling constraints
• Configure only as many vCPUs as needed
 Use Uniprocessor VMs for single-threaded applications
36
CPU – Scheduling
Over-committing physical CPUs
VMkernel CPU Scheduler
[Figure, animated across slides 36 to 38: over-committed physical CPUs under the VMkernel CPU scheduler; X marks appear in the later frames]
39
CPU – Ready Time
 The percentage of time that a vCPU is ready to execute, but waiting
for physical CPU time
 Does not necessarily indicate a problem
• Indicates possible CPU contention or limits
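esxtop reports ready time as a percentage (%RDY), while the vSphere Client charts report a summation in milliseconds per sampling interval. The conversion below is the commonly documented one and is not taken from the slides; the 20-second interval is the real-time chart default, and the sample values are hypothetical.

```python
def ready_percent(ready_ms, interval_s=20, num_vcpus=1):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    Dividing by num_vcpus gives the average per vCPU when the summation
    covers a whole multi-vCPU VM.
    """
    return ready_ms * 100 / (interval_s * 1000 * num_vcpus)

print(ready_percent(1000))               # one vCPU, real-time chart
print(ready_percent(4000, num_vcpus=4))  # VM-level summation over 4 vCPUs
```

Both calls describe the same per-vCPU ready time, which is why the units must be known before comparing numbers from the two tools.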
40
CPU – NUMA nodes
 Non-Uniform Memory Access system architecture
 Each node consists of CPU cores and memory
 A CPU core in one NUMA node can access memory in another
node, but at a small performance cost
NUMA node 1 NUMA node 2
41
CPU – NUMA nodes
 The VMkernel will try to keep a VM’s vCPUs local to its memory
• Internal NUMA migrations can occur to balance load
 Manual CPU affinity can affect performance
• vCPUs inadvertently spread across NUMA nodes
• Not possible with fully automated DRS
 VMs with more vCPUs than cores available in a single NUMA node
may see decreased performance
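A quick sizing check for the last point can be sketched as below. The host layout and VM sizes are hypothetical, and the memory comparison is an added assumption (a VM larger than one node's memory needs some remote memory); the slide itself only mentions vCPU count.

```python
def fits_in_numa_node(vcpus, vram_gb, cores_per_node, mem_per_node_gb):
    """True when a VM's vCPUs and memory both fit inside one NUMA node."""
    return vcpus <= cores_per_node and vram_gb <= mem_per_node_gb

# Hypothetical host: 2 NUMA nodes, each with 8 cores and 64 GB
CORES_PER_NODE = 8
MEM_PER_NODE_GB = 64

for name, vcpus, vram_gb in [("app01", 4, 16), ("db01", 12, 48), ("cache01", 8, 96)]:
    ok = fits_in_numa_node(vcpus, vram_gb, CORES_PER_NODE, MEM_PER_NODE_GB)
    print(f"{name}: {'fits in one node' if ok else 'spans NUMA nodes'}")
```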
42
CPU – Troubleshooting
 vCPU to pCPU over-allocation
• HyperThreading does not double CPU capacity!
 Limits or too many reservations
• Can create artificial limits
 Expecting the same consolidation ratios with different workloads
• Virtualizing “easy” systems first, then expanding to heavier systems
• Compare Apples to Apples
• Frequency, turbo, cache sizes, cache sharing, core count, instruction set…
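The vCPU to pCPU allocation from the first bullet can be computed directly. The sketch below counts physical cores only, in line with the HyperThreading warning; the host and VM sizes are hypothetical, and no particular ratio is claimed to be safe since that depends on the workload.

```python
def vcpu_to_core_ratio(vcpus_per_vm, sockets, cores_per_socket):
    """Allocated vCPUs divided by physical cores (hyperthreads not counted)."""
    physical_cores = sockets * cores_per_socket
    return sum(vcpus_per_vm) / physical_cores

# Hypothetical host: 2 sockets x 6 cores, six VMs
ratio = vcpu_to_core_ratio([4, 4, 2, 2, 8, 4], sockets=2, cores_per_socket=6)
print(f"vCPU:pCPU ratio = {ratio:.1f}:1")
```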
43
CPU – Best Practices
 Right-size vSMP VMs
 Keep heavy-hitters separated
• Fully automated DRS should do this for you
• Use anti-affinity rules if necessary
 Use a fully automated DRS cluster
• Test that vMotion works
• Use Resource Pools with High/Normal/Low shares
• Avoid using custom shares
STORAGE
45
Storage – esxtop Counters
 Different esxtop storage views
• Adapter (d)
• VM (v)
• Disk Device (u)
 Key Fields:
• DAVG + KAVG = GAVG
• QUED/USD – Command Queue Depth
• CMDS/s – Commands Per Second
• MBREAD/s
• MBWRTN/s
46
Storage – Troubleshooting with esxtop
 High DAVG: issue beyond the adapter
• Bad or overloaded zoning, over-utilized storage processors, too few platters in the
RAID set, etc.
 High KAVG: issue in the kernel storage stack
• Driver issue
• Full queue
 Aborts: GAVG exceeding 5000 ms
• Command will be repeated, storage delay for the VM
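The triage above can be sketched as a small function using GAVG = DAVG + KAVG. Only the 5000 ms abort figure comes from the slide; the DAVG and KAVG warning thresholds below are assumed rule-of-thumb values chosen for illustration and should be replaced with values agreed with the array vendor.

```python
ABORT_GAVG_MS = 5000  # from the slide: commands are aborted beyond this
DAVG_WARN_MS = 25     # assumption, not from the session
KAVG_WARN_MS = 2      # assumption, not from the session

def triage(davg_ms, kavg_ms):
    """Return guest-observed latency and the layers that look suspicious."""
    gavg_ms = davg_ms + kavg_ms
    findings = []
    if gavg_ms > ABORT_GAVG_MS:
        findings.append("abort")   # command will be aborted and repeated
    if davg_ms > DAVG_WARN_MS:
        findings.append("device")  # look beyond the adapter: fabric, SPs, RAID set
    if kavg_ms > KAVG_WARN_MS:
        findings.append("kernel")  # look at drivers and queue depth
    return gavg_ms, findings

print(triage(30.0, 0.5))
print(triage(1.0, 3.0))
```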
47
Storage – Benchmarking with iometer
48
Storage – Storage I/O Control
 Allows the use of Shares per VMDK
 Throttling occurs when datastore reaches latency threshold
• Higher share VMDKs perform IO first
 vCenter monitors latency across all hosts
• Not effective if datastore shared with other vCenters
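The effect of per-VMDK shares under contention can be illustrated with a proportional split. This is not the actual Storage I/O Control algorithm, only a sketch of share-based division; the number of queue slots and the VMDK names are hypothetical, and the share values are assumed to mirror High/Normal/Low presets.

```python
def split_by_shares(total_slots, shares):
    """Divide a contended resource in proportion to each VMDK's shares."""
    total_shares = sum(shares.values())
    return {name: total_slots * s / total_shares for name, s in shares.items()}

# Assumed share values for three disks on one congested datastore
shares = {"db01.vmdk": 2000, "web01.vmdk": 1000, "test01.vmdk": 500}

for name, slots in split_by_shares(35, shares).items():
    print(f"{name}: {slots:.0f} queue slots")
```

Shares only matter once the latency threshold is reached; below it, every VMDK issues I/O without throttling.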
49
Storage – Storage DRS
 Datastore clusters
• Maintenance mode
• Anti-affinity rules
 vCenter monitors for latency and disk space
• Migrate VMDKs for better performance or utilization
 Not effective with automated tiering SANs
• Check HCL to confirm these features are compatible
50
Storage – Troubleshooting
 Snapshots
 Excessive traffic down one HBA / Switch / SP can cause latency
• Consider using Round Robin in conjunction with ALUA
• Always be paranoid when it comes to monitoring storage I/O
 Consider your I/O patterns
• Peak time for storage IO?
• Virus scans, database maintenance, user logins
 Always consult with array vendor
• They know the best practices for their array!
51
Storage – Best Practices
 Use different tiers of storage for different VM workloads
• Slower storage for OS VMDKs
• Faster storage for databases or other high-IO applications
 Use the Paravirtual SCSI adapter
• Reduced overhead, higher throughput
 Use path balancing where possible, either through plugins
(PowerPath) or Round Robin with ALUA, if supported
 Use Storage DRS with SIOC
• Balance for both free space and latency
• Simplified datastore management
NETWORK
53
Network – Load Balancing
 Load balancing defines which uplink is used
• Route based on Port ID
• Route based on IP hash
• Route based on MAC hash
• Route based on NIC load
 Probability of high-bandwidth VMs being on the same physical NIC
 Traffic will stay on elected uplink until an event occurs
• NIC link state change, adding/removing NIC from a team, beacon probe
timeout…
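How an uplink gets elected can be modeled roughly as follows. Both functions are simplified assumptions about the port ID and IP hash policies (a modulo over the number of uplinks), not the vSwitch implementation; the uplink names and addresses are hypothetical.

```python
import ipaddress

UPLINKS = ["vmnic0", "vmnic1"]

def uplink_by_port_id(port_id, uplinks):
    """Simplified port ID policy: one virtual port always maps to one uplink."""
    return uplinks[port_id % len(uplinks)]

def uplink_by_ip_hash(src_ip, dst_ip, uplinks):
    """Simplified IP hash policy: XOR of source and destination address."""
    mixed = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return uplinks[mixed % len(uplinks)]

# A VM on virtual port 7 keeps the same uplink regardless of destination
print(uplink_by_port_id(7, UPLINKS))

# With IP hash the same VM can use different uplinks per destination
print(uplink_by_ip_hash("10.0.0.5", "10.0.0.20", UPLINKS))
print(uplink_by_ip_hash("10.0.0.5", "10.0.0.21", UPLINKS))
```

Under the port ID model two high-bandwidth VMs whose ports map to the same index share one physical NIC until an event triggers a new election, which is the probability the slide warns about.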
54
Network – Troubleshooting
 Check counters for NICs and VMs
• Network load imbalance
• 10 Gbps NICs can incur a significant CPU load when running at 100%
 Ensure hardware supports TSO
• Use latest drivers and firmware for your NIC on the host
 For multi-tier VM applications, use DRS affinity rules to keep VMs
on same host
• Same vSwitch / VLAN, rules out physical network
 If using Jumbo Frames, ensure it is enabled end-to-end
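The end-to-end requirement for Jumbo Frames means the smallest MTU along the path decides what actually passes. The sketch below finds the bottleneck in a hypothetical path and computes the largest ICMP payload that fits an MTU without fragmentation (IPv4 header 20 bytes, ICMP header 8 bytes). The hop names and MTU values are made up.

```python
IPV4_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8

def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES

def bottlenecks(hop_mtus, wanted_mtu):
    """Hops whose MTU is below the desired end-to-end MTU."""
    return [hop for hop, mtu in hop_mtus.items() if mtu < wanted_mtu]

# Hypothetical path from a VMkernel port to an iSCSI array
path = {
    "vmkernel port": 9000,
    "vSwitch": 9000,
    "physical switch": 1500,
    "storage array": 9000,
}

print("effective MTU:", min(path.values()))
print("hops to fix:", bottlenecks(path, 9000))
print("test payload for MTU 9000:", max_ping_payload(9000))
```

The computed payload is the size to pass to a don't-fragment ping from the host, for example `vmkping -d -s 8972` followed by the target address.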
55
Network – Best Practices
 Use the vmxnet3 virtual adapter
• Less CPU overhead
• 10 Gbps connection to vSwitch
 Use the latest driver/firmware for the NICs on the host
 Use network shares
• Requires Virtual Distributed Switch 4.1
 Isolate vMotion and iSCSI traffic from regular VM traffic
• Separate vSwitches with dedicated NIC(s)
• Most applicable with Gigabit NICs
56
In conclusion…
57
Key Takeaways – Performance Best Practices
 Understand your environment
• Hardware, storage, networking
• VMs & applications
 Advanced configuration values do not need to be tweaked or
modified
• In almost all situations
 Use fully automated DRS
 Use Paravirtual virtual hardware
58
Important Links
59
Important Links
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Your enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4jYour enemies use GenAI too - staying ahead of fraud with Neo4j
Your enemies use GenAI too - staying ahead of fraud with Neo4j
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 

Presentation: vSphere Performance Best Practices

  • 1. vSphere Performance Best Practices Peter Boone, VMware, Inc. INF-VSP1800 #vmworldinf
  • 2. 2 Disclaimer  This session may contain product features that are currently under development.  This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.  Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.  Technical feasibility and market demand will affect final delivery.  Pricing and packaging for any new technologies or features discussed or presented have not been determined.
  • 3. 3 Global Support Services and Customer Advocacy Support offices: Bangalore, India; Tokyo, Japan; Cork, Ireland; Burlington, Canada; Palo Alto, CA; Broomfield, CO. Local language support: Spanish, Portuguese, French, German, Japanese, Chinese. Global coverage: 24x7, 365 days/year. 6 Support Centers. 1000+ Support Engineers. Follow-the-sun support for Severity 1 issues. Support relationships with 100% of the Fortune 100; 99% of Fortune 500.
  • 4. 4 Customer Support Day Events Coming to a location near you: sharing of VMware best practices!  Support Days are a collaboration between VMware Support, Sales and customers – you learn directly from the experts  Topics are driven by customer input, and typically include: • Best practices • Tips/tricks • Top issues • Product roadmaps/demos • Certification offerings http://www.vmware.com/go/supportdays
  • 5. 5 Overview  What a performance problem sounds like: • “My VM is running slow and I don’t know what to do!” • “I tried adding more memory and CPUs but the problem got worse!” • “My VM is slow on one host but fast on another!”  What to look for? Where to start?  We will explore some of the most common performance-related issues that our support centers receive cases for
  • 6. 6 A word about performance…  Troubleshooting methodology must define: • How to find root cause • How to fix the problem  Must answer these questions: 1. How do we know when we are done? 2. Where do we start looking for problems? 3. How do we know what to look for to identify a problem? 4. How do we find the root cause of a problem we have identified? 5. What do we change to fix the root cause? 6. Where do we look next if no problem is found?
  • 7. 7 Agenda  Benchmarking & Tools  Best Practices and Troubleshooting  The 4 “food groups” • Memory • CPU • Storage • Network
  • 8. © 2012 VMware Inc. All rights reserved BENCHMARKING & TOOLS
  • 9. 9 Benchmarking  Consistent and reproducible results  Important to have base level of acceptable performance • Expectation vs. Acceptable  Determine baseline of performance prior to deployment • Benchmark on a physical system if applicable  Avoid subjective metrics, stay quantitative • “The system seems slower” • “This worked better last year”
  • 10. 10 Benchmarking  Benchmarking should be done at the application layer • Use application-specific benchmarking tools and load generators • Check with the application vendor  Isolate variables, benchmark optimum situation before introducing load  Understand dependencies • Human interaction • Other “food groups” • Compare apples-to-apples
  • 11. 11 Tools – vCenter Operations  Aggregates thousands of metrics into Workload, Capacity, Health scores  Self-learns “normal” conditions using patented analytics  Smart alerts of impending performance and capacity degradation  Identifies potential performance problems before they start
  • 12. 12 Tools – vCenter Operations
  • 13. 13 Tools – esxtop  Valuable tool built into vSphere hosts  View or capture real-time data • View or play back data later • Import data into third-party tools  vSphere Client performance graphs get their data from esxtop • Presentation/unit may be different (e.g. %RDY)  Little overhead impact on the host
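Captured esxtop data can be post-processed outside the host. The following is a minimal Python sketch, assuming the capture is a CSV file whose first row holds one header per counter; the header layout, host name, VM name, and sample values below are hypothetical and only serve to illustrate filtering one counter family out of a wide capture.

```python
import csv
import io


def extract_counter(csv_text, counter_suffix):
    """Return {column_header: [samples]} for every header ending in counter_suffix."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name.endswith(counter_suffix)]
    series = {header[i]: [] for i in wanted}
    for row in reader:
        for i in wanted:
            series[header[i]].append(float(row[i]))
    return series


# Hypothetical two-sample capture; a real capture has far more columns.
SAMPLE = (
    '"Timestamp","\\\\esx01\\Group Cpu(1001:web01)\\% Ready",'
    '"\\\\esx01\\Group Cpu(1001:web01)\\% Used"\n'
    '"09/01/2012 10:00:00","2.5","40.0"\n'
    '"09/01/2012 10:00:02","7.5","55.0"\n'
)

ready = extract_counter(SAMPLE, "\\% Ready")
for column, samples in ready.items():
    print(column, "average:", sum(samples) / len(samples))
```

Filtering by header suffix keeps the sketch independent of how many VMs or worlds appear in the capture.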
  • 14. MEMORY
  • 15. 15 Memory – Allocation  A VM’s RAM is not necessarily physical RAM • vRAM + overhead = maximum physical RAM  Whether or not that memory is physical or virtual depends on… • Host configuration • Shares • Limits • Reservations • Host load • Idle/Active VMs
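The relation "vRAM + overhead = maximum physical RAM" can be worked through with a small Python sketch. The function name, the overhead figure, and the assumption that a limit caps guest memory only (not overhead) are illustrative choices, not values from the slides; real overhead numbers come from the Resource Management Guide.

```python
def physical_memory_bounds_mb(vram_mb, overhead_mb, reservation_mb=0, limit_mb=None):
    """Return (guaranteed guest MB, maximum physical MB) for one VM.

    Assumes a limit, if set, caps the guest portion only.
    """
    if limit_mb is not None and limit_mb < reservation_mb:
        raise ValueError("limit must not be below the reservation")
    backed_mb = vram_mb if limit_mb is None else min(vram_mb, limit_mb)
    guaranteed_mb = min(reservation_mb, backed_mb)
    return guaranteed_mb, backed_mb + overhead_mb


# Hypothetical VM: 4 GB vRAM with 100 MB of overhead.
print(physical_memory_bounds_mb(4096, 100))
print(physical_memory_bounds_mb(4096, 100, reservation_mb=1024, limit_mb=2048))
```

Anything between the two bounds may be physical or reclaimed, depending on shares, host load, and how active the VM is.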
  • 16. 16 Memory – Overhead Source: vSphere 5.0 Resource Management Guide
  • 17. 17 Memory – Host Memory Management Occurs when memory is under contention  Transparent Page Sharing  Ballooning  Compression  Swapping
  • 23. 23 Memory – VM Resource Allocation
  • 24. 24 Memory – Resource Pool Allocation
  • 25. 25 Memory – Ballooning vs. Swapping  Ballooning is better than swapping  Guest can surrender unused/free pages  Guest chooses what to swap, can avoid swapping “hot” pages  Idle memory tax uses ballooning
  • 26. 26 Memory – Rightsizing  Generally, it is better to OVER-commit than UNDER-commit  If the running VMs are consuming too much host/pool memory… • Some VMs may not get physical memory • Ballooning or host swapping • Higher disk IO • All VMs slow down
  • 27. 27 Memory – Rightsizing  If a VM has too little vRAM… • Applications suffer from lack of RAM • The guest OS swaps • Increased disk traffic, thrashing • SAN slowdown as a result of increased disk traffic  If a VM has too much vRAM… • Higher overhead memory • Possible decreased failover capacity • Longer vMotion time • Larger VSWP file • Wasted resources
  • 28. 28 Memory – Troubleshooting  Wrong resource allocation  May not notice a limit, e.g. VM or template with a limit gets cloned  Custom share values  Ballooning or swapping at the host level • Ballooning is a warning sign, not a problem • Swapping is a performance issue if seen over an extended period  Swapping/paging at the guest level • Under-provisioned guest memory  Missing balloon driver (Tools)
  • 29. 29 Memory – Best Practices  Avoid high active host memory over-commitment • No host swapping occurs when total memory demand is less than the physical memory (Assuming no limits)  Right-size guest memory • Avoid guest OS swapping  Ensure there is enough vRAM to cover demand peaks  Use a fully automated DRS cluster • Test that vMotion works • Use Resource Pools with High/Normal/Low shares • Avoid using custom shares
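The first best practice (no host swapping while total demand stays below physical memory, assuming no limits) amounts to a simple sum. A minimal Python sketch, with hypothetical host size and per-VM demand figures:

```python
def host_memory_headroom_mb(host_physical_mb, vm_demands_mb):
    """Physical memory left after total VM demand; negative means contention."""
    return host_physical_mb - sum(vm_demands_mb.values())


# Hypothetical demand per VM (active memory plus overhead), in MB.
demands = {"db01": 16384, "web01": 4096, "web02": 4096}
headroom = host_memory_headroom_mb(32768, demands)
print("headroom MB:", headroom)
print("host swapping expected:", headroom < 0)
```

The sketch ignores limits and reclamation techniques such as page sharing, so it is a conservative first check rather than a prediction.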
  • 30. CPU
  • 31. 31 CPU – Overview  Raw processing power of a given host or VM • Hosts provide CPU resources • VMs and Resource Pools consume CPU resources  CPU cores/threads need to be shared between VMs  Fair scheduling of vCPU time • Hardware interrupts for a VM • Parallel processing for SMP VMs • I/O
  • 33. 33 CPU – esxtop  Interpret the esxtop columns correctly  %USED – Physical CPU usage  %SYS – Percentage of time in the VMkernel  %RUN – Percentage of total scheduled time to run  %WAIT – Percentage of time in blocked or busy wait states  %IDLE – Percentage of time idle; %WAIT minus %IDLE can be used to estimate I/O wait time
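The I/O wait estimate from the last column pair is a single subtraction. A minimal Python sketch; the function name and sample values are hypothetical, and the clamp at zero is an added safeguard against sampling noise rather than something the slides specify.

```python
def estimate_io_wait_pct(wait_pct, idle_pct):
    """Estimate I/O wait as %WAIT minus %IDLE, clamped at zero."""
    return max(0.0, wait_pct - idle_pct)


# Hypothetical readings for one VM.
print(estimate_io_wait_pct(wait_pct=85.0, idle_pct=70.0))
```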
  • 34. 34 CPU – Performance Overhead & Utilization  Different workloads have different overhead costs (%SYS) even for the same utilization (%USED)  CPU virtualization adds varying amounts of system overhead • Direct execution vs. privileged execution • Paravirtual adapters vs. emulated adapters • Virtual hardware (Interrupts!) • Network and storage I/O
  • 35. 35 CPU – vSMP  Relaxed Co-Scheduling: vCPUs can run out-of-sync  Idle vCPUs incur a scheduling penalty • Configure only as many vCPUs as needed • Extra vCPUs impose unnecessary scheduling constraints  Use Uniprocessor VMs for single-threaded applications
  • 36. 36 CPU – Scheduling: overcommitting physical CPUs (diagram: VMkernel CPU Scheduler)
  • 37. 37 CPU – Scheduling: overcommitting physical CPUs (diagram: VMkernel CPU Scheduler, two entries marked X)
  • 38. 38 CPU – Scheduling: overcommitting physical CPUs (diagram: VMkernel CPU Scheduler, four entries marked X)
  • 39. 39 CPU – Ready Time  The percentage of time that a vCPU is ready to execute, but waiting for physical CPU time  Does not necessarily indicate a problem • Indicates possible CPU contention or limits
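Because performance charts may report ready time in milliseconds per sampling interval while esxtop shows a percentage, a conversion helps when comparing the two. A minimal Python sketch, assuming a 20-second real-time sampling interval and a summed value across vCPUs; the function name and sample figures are hypothetical.

```python
def ready_ms_to_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert summed ready time (ms per interval) to an average percentage per vCPU."""
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000 * vcpus)


# Hypothetical readings: 1000 ms of ready time in one 20 s sample.
print(ready_ms_to_percent(1000))            # single-vCPU VM
print(ready_ms_to_percent(1000, vcpus=4))   # same total spread over 4 vCPUs
```

Dividing by the vCPU count gives a per-vCPU average, which can hide one badly delayed vCPU, so per-vCPU counters are still worth checking.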
  • 40. 40 CPU – NUMA nodes  Non-Uniform Memory Access system architecture  Each node consists of CPU cores and memory  A CPU core in one NUMA node can access memory in another node, but at a small performance cost (diagram: NUMA node 1, NUMA node 2)
  • 41. 41 CPU – NUMA nodes  The VMkernel will try to keep a VM’s vCPUs local to its memory • Internal NUMA migrations can occur to balance load  Manual CPU affinity can affect performance • vCPUs inadvertently spread across NUMA nodes • Not possible with fully automated DRS  VMs with more vCPUs than cores available in a single NUMA node may see decreased performance
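The last point, VMs wider than a single NUMA node, can be checked with simple arithmetic. A minimal Python sketch; the function name and the core counts are hypothetical, and it considers vCPU count only, not memory size.

```python
import math


def numa_nodes_spanned(vcpus, cores_per_node):
    """Minimum number of NUMA nodes needed to give each vCPU its own core."""
    if vcpus <= 0 or cores_per_node <= 0:
        raise ValueError("vcpus and cores_per_node must be positive")
    return math.ceil(vcpus / cores_per_node)


# Hypothetical host with 6 cores per NUMA node.
for vcpus in (4, 6, 8):
    nodes = numa_nodes_spanned(vcpus, cores_per_node=6)
    print(vcpus, "vCPUs ->", nodes, "node(s)", "(may cross nodes)" if nodes > 1 else "")
```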
  • 42. 42 CPU – Troubleshooting  vCPU-to-pCPU over-allocation • HyperThreading does not double CPU capacity!  Limits or too many reservations • Can create artificial limits  Expecting the same consolidation ratios with different workloads • Virtualizing “easy” systems first, then expanding to heavier systems • Compare apples to apples • Frequency, turbo, cache sizes, cache sharing, core count, instruction set…
  • 43. 43 CPU – Best Practices  Right-size vSMP VMs  Keep heavy-hitters separated • Fully automated DRS should do this for you • Use anti-affinity rules if necessary  Use a fully automated DRS cluster • Test that vMotion works • Use Resource Pools with High/Normal/Low shares • Avoid using custom shares
  • 44. STORAGE
  • 45. 45 Storage – esxtop Counters  Different esxtop storage views • Adapter (d) • VM (v) • Disk Device (u)  Key Fields: • DAVG + KAVG = GAVG • QUED/USD – Command Queue Depth • CMDS/s – Commands Per Second • MBREADS/s • MBWRTN/s
  • 46. 46 Storage – Troubleshooting with esxtop  High DAVG: issue beyond the adapter • Bad/overloaded zoning, overutilized storage processors, too few platters in the RAID set, etc.  High KAVG: issue in the kernel storage stack • Driver issue • Full queue  Aborts: GAVG exceeding 5000 ms • Command will be repeated, storage delay for the VM
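The DAVG/KAVG/GAVG reasoning can be sketched as a small triage helper in Python. The 5000 ms abort figure comes from the slide; the DAVG and KAVG warning thresholds are placeholder assumptions chosen for illustration, and the function name is hypothetical.

```python
def triage_latency(davg_ms, kavg_ms, davg_limit_ms=25.0, kavg_limit_ms=2.0, abort_ms=5000.0):
    """Return (GAVG, findings) using GAVG = DAVG + KAVG.

    davg_limit_ms and kavg_limit_ms are assumed thresholds; tune them per environment.
    """
    gavg_ms = davg_ms + kavg_ms
    findings = []
    if gavg_ms > abort_ms:
        findings.append("abort likely: GAVG above abort threshold")
    if davg_ms > davg_limit_ms:
        findings.append("high DAVG: look beyond the adapter (fabric, array)")
    if kavg_ms > kavg_limit_ms:
        findings.append("high KAVG: look at the kernel storage stack (driver, queue)")
    return gavg_ms, findings


# Hypothetical sample: slow device, healthy kernel path.
print(triage_latency(davg_ms=30.0, kavg_ms=0.5))
```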
  • 48. 48 Storage – Storage I/O Control  Allows the use of Shares per VMDK  Throttling occurs when datastore reaches latency threshold • Higher share VMDKs perform IO first  vCenter monitors latency across all hosts • Not effective if datastore shared with other vCenters
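Share-based throttling can be pictured as a proportional split of a constrained resource. The Python sketch below only illustrates the proportional-share idea; it is not Storage I/O Control's actual algorithm, and the function name, VMDK names, share values, and slot count are hypothetical.

```python
def split_by_shares(total_slots, shares):
    """Divide total_slots among entries in proportion to their share values."""
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("at least one entry needs a positive share value")
    return {name: total_slots * value / total_shares for name, value in shares.items()}


# Hypothetical VMDKs competing for 64 queue slots once the latency threshold is hit.
allocation = split_by_shares(64, {"db01.vmdk": 2000, "web01.vmdk": 1000, "test01.vmdk": 1000})
print(allocation)
```

Shares only matter under contention; below the latency threshold every VMDK can issue I/O without being throttled.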
  • 49. 49 Storage – Storage DRS  Datastore clusters • Maintenance mode • Anti-affinity rules  vCenter monitors for latency and disk space • Migrate VMDKs for better performance or utilization  Not effective with automated tiering SANs • Check HCL to confirm these features are compatible
  • 50. 50 Storage – Troubleshooting  Snapshots  Excessive traffic down one HBA / Switch / SP can cause latency • Consider using Round Robin in conjunction with ALUA • Always be paranoid when it comes to monitoring storage I/O  Consider your I/O patterns • Peak time for storage IO? • Virus scans, database maintenance, user logins  Always consult with array vendor • They know the best practices for their array!
  • 51. 51 Storage – Best Practices  Use different tiers of storage for different VM workloads • Slower storage for OS VMDKs • Faster storage for databases or other high-IO applications  Use the Paravirtual SCSI adapter • Reduced overhead, higher throughput  Use path balancing where possible, either through plugins (PowerPath) or Round Robin with ALUA, if supported  Use Storage DRS with SIOC • Balance for both free space and latency • Simplified datastore management
  • 52. NETWORK
  • 53. 53 Network – Load Balancing  Load balancing defines which uplink is used • Route based on Port ID • Route based on IP hash • Route based on MAC hash • Route based on NIC load  Probability of high-bandwidth VMs being on the same physical NIC  Traffic will stay on elected uplink until an event occurs • NIC link state change, adding/removing NIC from a team, beacon probe timeout…
  • 54. 54 Network – Troubleshooting  Check counters for NICs and VMs • Network load imbalance • 10 Gbps NICs can incur a significant CPU load when running at 100%  Ensure hardware supports TSO • Use latest drivers and firmware for your NIC on the host  For multi-tier VM applications, use DRS affinity rules to keep VMs on same host • Same vSwitch / VLAN, rules out physical network  If using Jumbo Frames, ensure it is enabled end-to-end
  • 55. 55 Network – Best Practices  Use the vmxnet3 virtual adapter • Less CPU overhead • 10 Gbps connection to vSwitch  Use the latest driver/firmware for the NICs on the host  Use network shares • Requires Virtual Distributed Switch 4.1  Isolate vMotion and iSCSI traffic from regular VM traffic • Separate vSwitches with dedicated NIC(s) • Most applicable with Gigabit NICs
  • 57. 57 Key Takeaways – Performance Best Practices  Understand your environment • Hardware, storage, networking • VMs & applications  Advanced configuration values do not need to be tweaked or modified • In almost all situations  Use fully automated DRS  Use Paravirtual virtual hardware