VMworld 2013: Extreme Performance Series: vCenter of the Universe

VMworld 2013

Justin King, VMware
Ravi Soundararajan, VMware

Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare

Published in: Technology

  1. Extreme Performance Series: vCenter of the Universe
      Justin King, VMware
      Ravi Soundararajan, VMware
      Session VSVC5234 (#VSVC5234)
  2. Goals
      • Help you understand vCenter architecture
      • Help you use this knowledge to guide vCenter deployment
  3. vCenter Deployment Options
      • One vCenter
      • Many vCenters
      • 1 vCenter per site
      • Multiple vCenters using linked mode within a single site
      • Multiple vCenters using linked mode across sites
      • …
  4. Agenda
      • Introduction
      • vCenter Architectural Deep Dive
      • Common Questions
      • Multiple vCenter deployment strategies
      • Conclusion
  5. For Most of You, This Is vCenter
      [Diagram labels: C# clients; API clients; vCenter server (vpxd); DB]
  6. However, This Is Approximately vCenter (We Will Dissect This…)
      [Diagram labels: vCenter server; VPXD; Inv Serv (Java); App Server; SSO; PBSM; Log; vctomcat (Health, SRS); DB; AD; vSphere Web Clients; VI Clients; API Clients; Update Manager; Converter; ESXi + HostD + VPXA; storage; network; …]
  7. Understanding vCenter Control Flow: Web Client Login
      1. Login
      2. SSO authenticates
      3. After the user is authenticated, the user has access to all providers registered with SSO (e.g., vCenter)
      [Diagram labels: vSphere Web Clients; App Server; SSO; AD; vCenter server]
  8. Understanding vCenter Control Flow: C# Client Login
      1. Login request to vCenter (vpxd service)
      2. vpxd contacts SSO for authentication
      3. User is able to view inventory
      Note: vpxd no longer talks directly to AD
      [Diagram labels: VI Clients; VPXD; SSO; AD; vCenter server]
  9. Understanding vCenter Control Flow: A PowerOn Operation
      1. PowerOn
      2. To vpxd
      3. DRS + admission control
      4. Issue command to ESX; report status
      5. Persist to DB
      6a, 6b. Notify clients; persist to Inv Svc
      Note: the client is already authenticated, so SSO is not invoked during the operation
      [Diagram labels: vSphere Web Clients; App Server; VPXD; Inv Serv; DB; ESXi + HostD + VPXA; storage; network]
  10. Agenda for vCenter Architectural Deep Dive
      • vCenter to ESX interactions
      • vCenter server internals
      • Database
      • Clients
  11. Architecture Deep Dive: vCenter to ESX Interactions
      3 main interactions:
      1. Command traffic (depends on load)
      2. Update host status (host sync)
      3. Statistics (bursty)
      [Diagram labels: VPXD (vCenter service); ESXi + HostD + VPXA; storage; network]
  12. vCenter-to-ESX Considerations: Latency and Throughput
      • Data transferred is typically small (KBs, not MBs)
      • Latency from VC to ESX has a larger impact than throughput
          • Latency example: 4x diff (100ms vs. 500ms) → 2x powerOn latency difference
          • Throughput example: 3x diff (512Kbps vs. 1.5Mbps) → 0 powerOn latency difference
      • Other implications of high latency or low throughput
          • Impact on statistics
              • Slower stats collection
              • Slower real-time queries
          • Impact on browsing
              • Console slower
              • Host config slower
          • Other stuff should be same
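The latency-versus-throughput point on slide 12 can be illustrated with a toy cost model. This is only a sketch: the number of round trips, the payload size, and the fixed processing time are hypothetical values chosen for illustration, not measurements from the talk, and the model is not expected to reproduce the slide's exact ratios. Only the latency and bandwidth pairs come from the slide.

```python
# Toy model: time = fixed work + (round trips x latency) + (payload / bandwidth).
# ROUND_TRIPS, PAYLOAD_BYTES and FIXED_S are illustrative assumptions.

ROUND_TRIPS = 10            # assumed VC<->ESX message exchanges per operation
PAYLOAD_BYTES = 4 * 1024    # assumed total payload ("KBs, not MBs")
FIXED_S = 2.0               # assumed time spent on non-network work


def operation_time_s(latency_s, throughput_bps):
    """Estimate wall-clock seconds for one management operation."""
    network_wait = ROUND_TRIPS * latency_s
    transfer = PAYLOAD_BYTES * 8 / throughput_bps
    return FIXED_S + network_wait + transfer


def compare():
    """Return (baseline, higher-latency, lower-throughput) estimates."""
    base = operation_time_s(0.100, 1_500_000)          # 100 ms, 1.5 Mbps
    slow_latency = operation_time_s(0.500, 1_500_000)  # 500 ms, 1.5 Mbps
    slow_link = operation_time_s(0.100, 512_000)       # 100 ms, 512 Kbps
    return base, slow_latency, slow_link


if __name__ == "__main__":
    base, slow_latency, slow_link = compare()
    print(f"baseline:         {base:.2f} s")
    print(f"higher latency:   {slow_latency:.2f} s")
    print(f"lower throughput: {slow_link:.2f} s")
```

With small payloads the transfer term is tiny, so cutting bandwidth barely moves the estimate, while every extra millisecond of latency is multiplied by the number of round trips.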
  13. Architecture Deep Dive: vCenter Server Internals
      vpxd (core business logic)
      • Sends tasks to appropriate hosts
      • Retrieves config changes from hosts
      • Pushes config updates to DB
      • Inserts stats into DB
      • Satisfies queries from clients
      → CPU/Memory important
      [Architecture diagram repeated from slide 6]
  14. Architecture Deep Dive: vCenter Server
      Inv Serv (Inventory Service)
      • Cache of DB data
      • Stores extension data (SRM, PBSM)
      • Satisfies Web client queries
      • Helps with Linked Mode search
      • Contains embedded DB
      → IO crucial: install on different spindles from vpxd
      → Multi-threaded: CPU/mem important
      App Server (Web Client Server)
      • Satisfies web client requests
      • Forwards to Inv Serv, SSO, etc.
      • Spawns remote console service
      → 1-1.5 CPUs should be enough
      [Architecture diagram repeated from slide 6]
  15. Architecture Deep Dive: vCenter Server
      SSO (Single Sign-On)
      • C/C++ plus Java-based STS (secure token service)
      • Handles authentication
      • Communicates with AD, etc.
      vctomcat
      • Contains Health service
      • Contains SRS
          • Stats reporting service for overview perf charts
          • Retrieves data from DB
      • Contains EAM
          • ESX Agent Manager for manager VMs
      [Architecture diagram repeated from slide 6]
  16. Architecture Deep Dive: vCenter Server
      Log (Log Browser service)
      • Allows log viewing in web client
      PBSM (Policy-based storage mgr)
      • Contains SMS + policy engines
      • Satisfies “Storage View” queries from clients
      • Every 2 hrs, queries DB and Inv Serv for most up-to-date data
      → Can be CPU/Mem-intensive during queries
      [Architecture diagram repeated from slide 6]
  17. vCenter Server Resource Usage
      [Chart labels: vctomcat (SRS, EAM, Health, etc.); Inventory Service; Web Client App Server and remote console; PBSM; STS; Log Browser]
  18. vCenter Server Performance Considerations (1 of 2)
      Resource requirements
      • Many new services
          • Need sufficient CPU and memory
          • May need to tune JVM heap sizes according to inventory size
      • Rules of thumb (unofficial; please check documentation):
          • Small setups (< 1000 VMs): 2-4 vCPUs, 8-12GB
          • Medium setups (< 4000 VMs): 4-8 vCPUs, 12-24GB
          • Large setups (> 4000 VMs): 8-16 vCPUs, 24-32GB
      • Embedded database for Inventory Service
          • IO requirements higher (2-3K IOPS depending on load)
          • Place on its own spindles (separate from other services)
          • Consider SSDs
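The unofficial sizing rules of thumb on slide 18 can be written as a small lookup. This is a sketch, not an official sizing tool. The function and type names are hypothetical, and because the slide leaves exactly 4000 VMs unspecified, the sketch assumes that case falls in the large tier.

```python
from typing import NamedTuple, Tuple


class Sizing(NamedTuple):
    tier: str
    vcpus: Tuple[int, int]       # (min, max) vCPUs
    memory_gb: Tuple[int, int]   # (min, max) memory in GB


def rule_of_thumb(vm_count: int) -> Sizing:
    """Map an inventory size to the slide's unofficial sizing range."""
    if vm_count < 0:
        raise ValueError("vm_count must be non-negative")
    if vm_count < 1000:
        return Sizing("small", (2, 4), (8, 12))
    if vm_count < 4000:
        return Sizing("medium", (4, 8), (12, 24))
    # Assumption: exactly 4000 VMs is treated as "large".
    return Sizing("large", (8, 16), (24, 32))


if __name__ == "__main__":
    for count in (500, 2500, 6000):
        print(count, rule_of_thumb(count))
```

As the slide itself says, these ranges are a starting point only; the product documentation remains the reference.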
  19. vCenter Server Performance Considerations (2 of 2)
      Inventory structure
      • A single datastore/datacenter/network can sometimes be a vCenter bottleneck
      • Several smaller clusters may be better than 1 big cluster
      • Spreading hosts/networks/datastores across different datacenters relieves some bottlenecks
  20. Architecture Deep Dive: vCenter-to-Database Interactions
      VC talks to DB when…
      1. Persisting statistics (5-minute intervals)
      2. Persisting config changes (e.g., host syncs); higher when more tasks
      3. Answering certain UI queries (e.g., cluster/datacenter charts, historical stats queries like past-day, past-week, etc.)
      4. Persisting version information (for Inv Svc)
      DB also performs these tasks:
      • Stats rollups: VPX_HIST_STATX
          • 30 minutes, 2 hours, 1 day
      • Purging stats
          • When entities are deleted
      • Purging events (if auto-purge configured)
      • Purging tasks (if auto-purge configured)
      • TopN computation
          • 10 min, 30 min, 2 hours, 1 day
      • Satisfying SMS data refresh for Storage views (every 2 hours)
      [Diagram labels: VPXD; DB; ESXi + HostD + VPXA; storage; network]
  21. DB Performance Considerations (1 of 2)
      Latency to DB is important (often more so than ESX-to-VC latency)
      • Almost everything involves the DB…
          • Stats persistence
          • Certain UI queries
          • Updating configuration information
          • Historical queries (events, alarms, task history)
          • …
      Recommendation: place DB and vCenter close together
      Note: DB and vCenter on different hosts/VMs allows for independent sizing and tuning
  22. DB Performance Considerations (2 of 2)
      • DB traffic is write-mostly
          • Stats inserts and rollups, version updates, config changes, purges
          • Sufficient disk subsystem needed. If SSDs are an option, use them (2K IOPS)
      • Manage database disk growth
          • Majority of DB data is “SEAT” data (Stats, Events, Alarms, Tasks): 80-85% (10s of GBs or more in big setups)
          • Inventory data: 10-15% of data (usually < 10GB for large inventories)
          • Choose stats levels wisely to avoid excessive growth
          • Utilize automatic purging of event/task tables if possible
      • Recompute DB stats on highly volatile tables (at least once a day)
          • VPX_PROPERTY_BULLETIN
          • VPX_TOPN*
  23. Architecture Deep Dive: Client Interactions
      • C# VI client refreshes frequently
          • Induces load on vpxd → More clients, more load
      • Web client
          • Does not auto-refresh
          • Read requests satisfied by app server, not vpxd → Less load on vpxd
      • API clients
          • If listening to a subset of inventory/properties, small load on vCenter
      • Limit of 2000 sessions to vCenter: includes all clients + remote console
      • App server: can be placed in the same geo or on the same server as Inv Svc
      [Architecture diagram repeated from slide 6]
  24. Client Considerations
      • Clients add load
          • If you aren’t using a session, log out
      • Web Client App Server can go on the same server as Inventory Service
          • Small resource footprint
          • Low latency to inventory service
      • For API clients, try to be a good citizen
          • Avoid frequent/expensive DB calls
              • Example: frequent createEventHistoryCollector calls with a complex EventFilterSpec
          • Monitor specific inventory items or properties, not all entities and all properties
          • Log out when you are done (don’t waste sessions!)
  25. Client Notes: Simple Example of “Bad” Client in PowerCLI
      • Example of a good vs. bad client in PowerCLI
      • PowerCLI:
          • Simple to use, but involves client-side filtering
          • Example: Get-VM gets all VMs from server, filters list at the client
      • Good: 1 server call, client throws away all but vm1, vm2, vm3, vm4
          $vmList = Get-VM -Name "vm1","vm2","vm3","vm4"
      • Bad: 4 server calls, gets all VMs 4 times (excess client/server work)
          $nameList = "vm1","vm2","vm3","vm4"
          foreach ($name in $nameList) { Get-VM $name }
      • Also: please log out when you are done!
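The cost difference between the two PowerCLI patterns on slide 25 can be made concrete with a toy simulation. This is not PowerCLI and makes no claim about how Get-VM is implemented internally. `ToyInventoryServer` and both helper functions are hypothetical names, and the model simply adopts the slide's description: each call returns the full VM list and the client filters it.

```python
class ToyInventoryServer:
    """Stand-in for a server that returns every VM on each call."""

    def __init__(self, vm_names):
        self._vm_names = list(vm_names)
        self.calls = 0          # number of server round trips
        self.objects_sent = 0   # total VM records shipped to the client

    def list_all_vms(self):
        self.calls += 1
        self.objects_sent += len(self._vm_names)
        return list(self._vm_names)


def get_vms_single_call(server, wanted):
    """'Good' pattern: one call, then filter client-side for all names."""
    wanted_set = set(wanted)
    return [name for name in server.list_all_vms() if name in wanted_set]


def get_vms_one_call_per_name(server, wanted):
    """'Bad' pattern: one full-inventory call per requested name."""
    found = []
    for target in wanted:
        found.extend(name for name in server.list_all_vms() if name == target)
    return found


if __name__ == "__main__":
    wanted = ["vm1", "vm2", "vm3", "vm4"]
    good = ToyInventoryServer(f"vm{i}" for i in range(1000))
    bad = ToyInventoryServer(f"vm{i}" for i in range(1000))
    get_vms_single_call(good, wanted)
    get_vms_one_call_per_name(bad, wanted)
    print("single call:   ", good.calls, "calls,", good.objects_sent, "records")
    print("call per name: ", bad.calls, "calls,", bad.objects_sent, "records")
```

Both helpers return the same four VMs; the per-name loop just pays for the whole inventory once per name.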
  26. vCenter Architecture: Summary (Whew!)
      [Architecture diagram repeated from slide 6]
  27. Agenda
      • Introduction
      • vCenter Architectural Deep Dive
      • Common Questions
      • Multiple vCenter deployment strategies
      • Conclusion
  28. You Say n VMs/Hosts, but I Can Only Reach N. Why?
      How we set limits
      • Create a ‘large environment’
      • Attach clients, solutions, etc.
      • Run management operations (clones, powerOps, etc.)
      • Measure latency and throughput
      Why your setup may not reach our scale
      • Different stats level
      • Different device configuration of hosts/VMs (e.g., # of datastores)
      • Different DB configuration (less memory, different recovery mode)
      • Different latencies from VC to ESX or VC to DB
      • Viewing different client pages
      • Accumulating events and tasks vs. purging them
      → Each might stress your vCenter/DB/network etc. more than ours
  29. How Many Concurrent Operations Can I Perform? (1 of 2)
      • vCenter hard limits
          • 640 concurrent operations before incoming requests are queued
          • 2000 concurrent sessions (incoming requests plus remote console sessions)
      • Per-host or per-datastore limits
          • A host can perform up to 8 provisioning operations at once (provisioning = clone, VMotion, relocate)
          • If a host is both source and destination, the host can only do 4 operations at once
          • A datastore can perform up to 128 VMotions at once
          • A datastore can perform up to 8 Storage VMotions at once
          • Limits can be changed, but changes are not officially supported
      • Other limits
          • Datacenter/host/datastore synchronization at VC can limit concurrency
  30. vCenter Concurrency (2 of 2)
      • Clone VM from host A to host B (cost to A: 1, cost to B: 1)
          → Each host can participate in 7 other provisioning operations
      • Clone VM from host A to host A (cost to A: 2)
          → Host A can only participate in 6 more operations
      • Do not use a single host as the source of all clones (i.e., spread out templates)
          → Better disk performance and better concurrency
      [Diagram labels: vCenter; Host A; Host B; VM 1; VM 2]
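The per-host cost accounting described on slides 29 and 30 can be sketched as a small slot tracker. This only models the arithmetic on the slides (8 provisioning slots per host, one slot charged per role a host plays in a clone). The class and method names are hypothetical, and vCenter's real scheduler is not described in the talk.

```python
HOST_SLOTS = 8  # per-host provisioning limit quoted on slide 29


class ProvisioningTracker:
    """Track how many provisioning slots each host has in use."""

    def __init__(self, slots_per_host=HOST_SLOTS):
        self.slots_per_host = slots_per_host
        self.in_use = {}

    def remaining(self, host):
        return self.slots_per_host - self.in_use.get(host, 0)

    @staticmethod
    def _cost(source, destination):
        # A host acting as both source and destination is charged twice.
        if source == destination:
            return {source: 2}
        return {source: 1, destination: 1}

    def try_start_clone(self, source, destination):
        """Reserve slots for a clone; return False if any host is full."""
        cost = self._cost(source, destination)
        if any(self.remaining(host) < c for host, c in cost.items()):
            return False  # nothing is charged when the clone is refused
        for host, c in cost.items():
            self.in_use[host] = self.in_use.get(host, 0) + c
        return True


if __name__ == "__main__":
    tracker = ProvisioningTracker()
    tracker.try_start_clone("A", "B")
    print("after A->B:", tracker.remaining("A"), tracker.remaining("B"))
    same_host = ProvisioningTracker()
    same_host.try_start_clone("A", "A")
    print("after A->A:", same_host.remaining("A"))
```

Under this accounting a host cloning to itself runs out of slots after 4 clones, which matches the "only 4 operations at once" figure, and a single template host becomes the limiting factor for every clone sourced from it.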
  31. Why Should I Upgrade from VC5.0?
      • One big reason: in 5.1 and 5.5, stats tables are partitioned
          • Stats inserts more efficient (into a small partition at a time)
          • Rollups more efficient (plus, amount of data rolled up at once is throttled)
          • Stats data purging more efficient (simply truncating a partition)
          • vCenter can support higher stats levels for longer periods of time
          • Still recommend running higher stats levels (2-4) only for temporary troubleshooting
      [Diagram labels: Inserts; Rollups; Purge]
  32. What Is the Real Dirt on Stats Levels?
      • Changing stats levels increases load on the database
      • Rough rules of thumb (not official VMware recommendations)
          • Level 1 stats: per-VM and per-host aggregate stats
          • Level 2 stats: additional per-VM/per-host stats
              • 4x or more stats than Level 1, depending on configuration
          • Level 3 stats: per-instance stats
              • 6x or more stats than Level 2, depending on configuration
          • Level 4 stats: additional rollup types
              • 1.4x more stats than Level 3, depending on configuration
      • Use the stats calculator in vCenter
      • Try to use higher stats levels only for temporary debugging
      • If the stat you want is at the wrong level, let us know
      • Consider VCOps for more advanced stats functionality?
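The per-level multipliers on slide 32 compound, which a few lines of arithmetic make visible. This is a sketch under stated assumptions: the slide's figures are rough, unofficial, and mostly lower bounds ("or more"), and "1.4x more" is read here as 1.4 times as many stats as Level 3. The function name is hypothetical.

```python
# Multiplier of each level relative to the level below it (from slide 32).
LEVEL_MULTIPLIERS = {2: 4.0, 3: 6.0, 4: 1.4}


def relative_stats_volume(level):
    """Rough stats volume at a level, relative to Level 1 = 1.0."""
    if level not in (1, 2, 3, 4):
        raise ValueError("stats level must be 1, 2, 3 or 4")
    volume = 1.0
    for step in range(2, level + 1):
        volume *= LEVEL_MULTIPLIERS[step]
    return volume


if __name__ == "__main__":
    for level in (1, 2, 3, 4):
        print(f"Level {level}: ~{relative_stats_volume(level):.1f}x Level 1")
```

Read this way, Level 3 is at least roughly 24 times the Level 1 volume and Level 4 roughly 34 times, which is why the slide suggests higher levels only for temporary debugging.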
  33. Should I Distribute VC Services across VMs? (1 of 2)
      • You can distribute services (Inv Svc, SSO, vpxd, DB) to multiple VMs, but…
          • Better performance when vpxd and Inv Svc are co-located
          • Better performance when Web Client service and Inventory Service are close together
          • Better performance when vpxd and DB are close together
  34. Should I Distribute VC Services across VMs? (2 of 2)
      • Typical deployment pre-5.1
          • VC and assorted services in 1 VM
          • VC DB in another VM
          → Will still work fine with VC 5.5
      • Another suggestion
          • Put all in 1 VM
          • Make sure the VM has sufficient CPU/Memory/Disk/Network (follow best practices)
          • Put Inventory Service partition on separate spindles from vpxd and DB
          • Put DB partition on separate spindles
          • Advantage: looks ahead to future ‘single-VM’ appliance
  35. Why Are Cluster/Datacenter Charts Sometimes Slow?
      • These charts are computed on the fly
      • They require collection of data from hosts and VMs
      • A single slow host can hurt performance
  36. Agenda
      • Introduction
      • vCenter Architectural Deep Dive
      • Common Questions
      • Multiple vCenter deployment strategies
      • Conclusion
  37. When Should I Use Multiple vCenters?
      Considerations
      • Have you exceeded the single host limit?
      • Do you want one vCenter per geography?
      • Do you want one vCenter per organizational boundary? (finance, engineering, etc.)
      • Do you want a primary and secondary site (e.g., SRM)?
      • Do you prefer to manage smaller VCs?
  38. Single Site with Multiple vCenters
      Important considerations
      • How do I decide how many vCenters I need? (Consider vCenter limits, organizational boundaries)
      • Do I want a single view of inventory managed by all vCenters?
      • How do I synchronize roles/permissions across vCenters?
      Yes? Consider “linked mode” …
      [Diagram labels: Site A; ESX hosts; vCenter Servers; AD; VI Clients; API Clients]
  39. Linked Mode
      • Single pane of glass from UI for inventory data
      • Search across VC instances
      • Unified roles and permissions via AD
  40. Linked Mode Architecture
      [Diagram labels: GUI; Linked Mode; AD; three vCenter instances, each shown with VC DB, ADAM, IS, and Role A]
  41. Multiple vCenters in a Single Site in Linked Mode
      Important considerations:
      • At most 10 vCenters can be linked together
      • Does not work on vCenter Server Appliance (ADAM replication)
      • Cross-vCenter operations not available
      • API is not linked mode aware
      [Diagram labels: Site A; ESX hosts; vCenter Servers; AD; VI Clients; API Clients]
  42. Linked Mode and Single Sign-On Considerations
      Linked Mode
      • Should I use linked mode across multiple sites?
          • Business units that have computing needs across data centers
      • What impact does bandwidth have on cross-site linked mode?
          • Except for query federation, linked mode sites only communicate via ADAM
          → Linked mode adds minimal cross-site network overhead over multi-site without linked mode
          → Bandwidth tradeoffs same as for multi-site vCenters without linked mode
      Single Sign-On
      • Extend the vSphere authentication domain across sites
      • Use domain accounts for permissions instead of local OS accounts
      • Define replication partners for WAN replication
  43. Agenda
      • Introduction
      • vCenter Architectural Deep Dive
      • Common Questions
      • Multiple vCenter deployment strategies
      • Conclusion
  44. Looking Ahead (No Timelines…)
      • Many things, but a few main ones:
          • Single-VM vCenter appliance that can support increasing scale and federation
          • Improved performance and scalability
          • Operations across VC (like cross-VC VMotion)
  45. Conclusion
      • Single vCenter: some key takeaways
          • Services can be placed in the same VM
          • IO performance is critical for vCenter and inventory service
          • DB provisioning is critical
          • VC-to-DB latency is important
      • Multiple vCenters: why?
          • Exceeding single vCenter limits
          • Organizational boundaries
          • Security and compliance
          • Local/remote administration
      • Should I use linked mode?
          • Single pane of glass from UI? Yes (but also possible with just Web Client…)
          • Synchronized roles? Yes
  46. Performance Community Resources
      • Performance Technology Pages
          • http://www.vmware.com/technical-resources/performance/resources.html
      • Technical Marketing Blog
          • http://blogs.vmware.com/vsphere/performance/
      • Performance Engineering Blog VROOM!
          • http://blogs.vmware.com/performance
      • Performance Community Forum
          • http://communities.vmware.com/community/vmtn/general/performance
      • Virtualizing Business Critical Applications
          • http://www.vmware.com/solutions/business-critical-apps/
  47. Performance Technical Resources
      • Performance Technical Papers
          • http://www.vmware.com/resources/techresources/cat/91,96
      • Performance Best Practices
          • http://www.youtube.com/watch?v=tHL6Vu3HoSA
          • http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.0.pdf
          • http://www.vmware.com/pdf/Perf_Best_Practices_vSphere4.1.pdf
          • http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf
          • http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf
      • Troubleshooting Performance Related Problems in vSphere Environments
          • http://communities.vmware.com/docs/DOC-14905 (vSphere 4.1)
          • http://communities.vmware.com/docs/DOC-19166 (vSphere 5)
          • http://communities.vmware.com/docs/DOC-23094 (vSphere 5.x with vCOps)
  48. Don’t miss:
      • vCenter of the Universe: Session # VSVC5234
      • Monster Virtual Machines: Session # VSVC4811
      • Network Speed Ahead: Session # VSVC5596
      • Storage in a Flash: Session # VSVC5603
      • Big Data: Virtualized SAP HANA Performance, Scalability and Practices: Session # VAPP5591
  49. Other VMware Activities Related to This Session
      • HOL: HOL-SDC-1304 vSphere Performance Optimization
      • Group Discussions: VSVC1001-GD Performance with Mark Achtemichuk
  50. THANK YOU
