VDI DESIGN GUIDE
Hypervisor | Server | Storage
What Would Dan Do?
Blog: http://www.danbrinkmann.com
Twitter: @dbrinkmann
VMware vExpert 2012
Decision Points
• Hypervisor
• Server
• Storage
Before You Go Further
• Have a business reason
• Determine user groups and applications
• Determine requirements
• Pilot
• Gather data
Design
• VMs per host
• Hosts per cluster
• Management infrastructure limits
• Storage IO / capacity limits
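As a rough illustration of how these limits interact, here is a minimal sizing sketch. Every input number is a placeholder, not a recommendation from this deck; replace them with data from your own pilot.

```python
import math

# Hypothetical inputs -- replace with numbers from your own pilot data.
total_desktops = 2000        # desktops to deliver
vms_per_host = 100           # from the pilot: CPU, memory, or IO limited, whichever hits first
max_hosts_per_cluster = 8    # design limit chosen for failure domain / management

# Hosts needed for the workload, plus one spare host per cluster for N+1.
workload_hosts = math.ceil(total_desktops / vms_per_host)
clusters = math.ceil(workload_hosts / (max_hosts_per_cluster - 1))  # reserve 1 host per cluster
total_hosts = workload_hosts + clusters                             # add the N+1 spares

print(f"{workload_hosts} workload hosts, {clusters} clusters, {total_hosts} hosts total")
```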
Hypervisor Choices
Options (my experience):
• VMware vSphere: 85% - 90%
• Citrix XenServer: 10% - 15%
• Microsoft Hyper-V: ~0%
• VMware View = vSphere hypervisor
Hypervisor Choices
XenServer
• Storage stack
   • NFS is the best option
   • LVM over iSCSI or FC is confusing for “Windows” people
   • StorageLink… /sigh
• Lack of 3rd party integration
• Lack of skilled engineers in the market
• Management built in to the hypervisor (XenCenter)
Hypervisor Choices
Microsoft Hyper-V
• 2008 R2 not relevant
• Hyper-V 2012
   • Offloaded data transfer
   • No more redirected mode
   • Additional storage options
   • CSV cache: http://blogs.msdn.com/b/clustering/archive/2012/03/22/10286676.aspx
   • 3rd party integrations
   • Network teaming support
Hypervisor Choices
Which one should I choose?
• vSphere
   • Most features
   • Broadest support among vendors
   • Largest base of skilled engineers
   • Most 3rd party integrations
   • Highest cost
• Translation: the least amount of brain damage (today)
Should I Virtualize vCenter?
• Consider an infrastructure cluster
(Diagram: management VMs run on a dedicated vSphere cluster, separate from the vSphere clusters hosting the virtual desktops)
Should I Virtualize vCenter?
• Use VM-Host affinity rules
Servers Determine a Lot
• Storage options
• Density – CPU/memory
• High availability design
First, Some Virtual Desktop Fallacies
• vCPU count
• vCPU to pCore overcommit
• Memory requirements
• Storage IO requirements
• Don’t believe vendor estimates
Servers - Rackmount vs Blades
• Supermodel or the girl next door
Servers - Rackmount vs Blades
Failure domain
• Blade chassis vs individual rackmount server
• Design for N+1 blade chassis
• Blade chassis failures I’ve seen
   • Backplane failure
   • Integrated networking (interconnect) failure
Servers - Rackmount vs Blades
Local disk
• Blade chassis local disk is limited and/or expensive
• PCI-Express cards not always available as a mezzanine option
• Desktop persistence
Servers – CPU
Hypervisor (vSphere) CPU scheduler:
“When making scheduling decisions, the ratio of the consumed CPU resources to the entitlement is used as the priority of the world. If there is a world that has consumed less than its entitlement, the world is considered high priority and will likely be chosen to run next.”
http://www.vmware.com/resources/techresources/10131
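To make the quoted behavior concrete, here is a toy model (not the actual scheduler code): the world that has consumed the least relative to its entitlement is treated as highest priority and picked to run next. The world names and numbers are invented for illustration.

```python
# Toy model of the quoted scheduling behavior: the world with the lowest
# consumed-to-entitlement ratio is considered highest priority.
worlds = {
    "desktop-01": {"consumed_ms": 800, "entitlement_ms": 1000},
    "desktop-02": {"consumed_ms": 300, "entitlement_ms": 1000},
    "desktop-03": {"consumed_ms": 950, "entitlement_ms": 500},
}

def priority_ratio(world):
    return world["consumed_ms"] / world["entitlement_ms"]

next_world = min(worlds, key=lambda name: priority_ratio(worlds[name]))
print(f"Likely chosen to run next: {next_world}")  # desktop-02
```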
Servers - CPU
Compute with a physical PC (diagram: a single OS/Apps/Profile stack on CPU 1)
Servers - CPU
Compute in RDSH / XenApp (diagram: many OS/Apps/Profile stacks sharing CPU 1 and CPU 2)
Servers - CPU
Compute in VDI (diagram: virtual desktops scheduled across CPU 1 and CPU 2)
Servers - CPU
CPU utilization alone is not enough
Servers - CPU
This is proper CPU monitoring (esxtop CPU display):

Metric    Threshold  Explanation
%RDY      10         Overprovisioning of vCPUs, excessive usage of vSMP, or a limit has been set (check %MLMTD).
%CSTP     3          Excessive usage of vSMP. Decrease the number of vCPUs for this particular VM; this should lead to increased scheduling opportunities.
%SYS      20         The percentage of time spent by system services on behalf of the world. Most likely caused by a high-IO VM. Check other metrics and the VM for a possible root cause.
%MLMTD    0          The percentage of time the vCPU was ready to run but deliberately wasn’t scheduled because that would violate the “CPU limit” settings. If larger than 0, the world is being throttled due to the limit on CPU.
%SWPWT    5          VM waiting on swapped pages to be read from disk. Possible cause: memory overcommitment.
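esxtop shows these counters directly, but vCenter exposes CPU ready as a summation in milliseconds per sample interval, so it has to be converted to a percentage before comparing against the thresholds above. A minimal sketch, assuming the 20-second real-time sample interval:

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms) into a percentage of the sample interval."""
    return ready_ms / (interval_s * 1000.0) * 100.0

# Example: 2,600 ms of ready time in a 20 s sample is 13% ready.
rdy = cpu_ready_percent(2600)
# Note: the summation covers all of a VM's vCPUs, so compare per vCPU
# (or scale the threshold) before deciding a VM is CPU starved.
print(f"%RDY ~ {rdy:.1f} -> {'investigate' if rdy > 10 else 'ok'}")
```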
Servers – 2 Socket vs 4 Socket
Failure domain
• Smaller host = fewer desktops affected
• Smaller host might also mean more clusters
Servers – 2 Socket vs 4 Socket
Local disk options (rackmount)
• 2 socket servers: 8-26 bays
• 4 socket servers: 8-16 bays
• 16 drives * 175 IOPs = 2,800 IOPs (not RAID adjusted)
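The 2,800 figure above is raw spindle throughput; RAID write penalties reduce what the desktops actually get. A back-of-the-envelope sketch, where the 70% write mix and the penalty factors are assumptions (VDI steady state is commonly write-heavy), not numbers from this deck:

```python
# Effective front-end IOPs after the RAID write penalty.
# Raw figure from the slide: 16 drives * 175 IOPs = 2,800.
RAID_WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def effective_iops(raw_iops: float, write_ratio: float, raid: str) -> float:
    penalty = RAID_WRITE_PENALTY[raid]
    # Each front-end write costs `penalty` back-end IOs; each read costs 1.
    return raw_iops / (write_ratio * penalty + (1 - write_ratio))

raw = 16 * 175
for level in ("RAID10", "RAID5"):
    print(level, round(effective_iops(raw, write_ratio=0.7, raid=level)))
# RAID10 ~1647, RAID5 ~903 -- far less than the raw 2,800.
```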
Servers – 2 Socket vs 4 Socket
$$$ Price $$$
• Historically, 4 socket servers have not been a linear price increase over 2 socket servers… so is that still true today?
Servers – 2 Socket vs 4 Socket
Sample pricing
• 2 socket Intel E5-26xx 8c, 384GB RAM: ~$15,500
• 4 socket Intel E5-46xx 8c, 768GB RAM: ~$32,000 - $36,500
• ~3% - 18% premium over two 2 socket servers (2 × $15,500 = $31,000)
Servers - CPU
• More cores is more better (E5 8c, E7 10c)
• AMD vs Intel
Servers - Memory
• Buy a lot of it!
• Do not run out!
• 16GB DIMM size is common
• 24 DIMM slots × 16GB = 384GB
Servers – Hidden Memory Requirements
• Memory overhead, driven by:
   • Number of vCPUs
   • Amount of RAM
   • Amount of vRAM
   • 3D support
• Memory pressure (transparent page sharing, ballooning, hypervisor swapping, memory compression, swap to host cache on SSD)
   • http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1033687
   • minFreePct 2% - 6%
   • 6% of 384GB is 23GB
• Hypervisor requirements
• Storage caching (CBRC, CSV cache)
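A quick sketch of how these hidden items eat into the 384GB from the previous slide. The hypervisor footprint, cache size, per-VM overhead, and desktop RAM below are illustrative assumptions; check the linked KB and your own hosts for real values.

```python
# Illustrative numbers only -- per-VM overhead varies with vCPU count, vRAM, and 3D support.
host_ram_gb        = 384
min_free_pct       = 0.06   # worst case from the slide (6% of 384GB is ~23GB)
hypervisor_gb      = 4      # assumed ESXi + agent footprint
cache_gb           = 2      # assumed storage caching (CBRC / CSV cache)
per_vm_overhead_gb = 0.15   # assumed VM memory overhead
vm_ram_gb          = 2      # assumed desktop VM size

usable_gb = host_ram_gb * (1 - min_free_pct) - hypervisor_gb - cache_gb
desktops = int(usable_gb // (vm_ram_gb + per_vm_overhead_gb))
print(f"~{usable_gb:.0f} GB usable -> ~{desktops} desktops with no memory overcommit")
```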
Storage
VMs per datastore / LUN
• VAAI ATS (atomic test and set)
   • vSphere 4.1 uses ATS for 2 of the 8 VMFS locking operations; 5.0 U1 uses ATS for all 8: http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html
• <140 VMs per datastore
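How the <140 guideline turns into a datastore count, with an assumed pool size for illustration:

```python
import math

desktops = 2000                # assumed example pool size
max_vms_per_datastore = 140    # guideline from the slide

print(math.ceil(desktops / max_vms_per_datastore), "datastores")  # 15
```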
Storage
Local disk
• Will most likely impact the server hardware decision
• Small failure domain
• Spinning disk limitations
• SSD or PCI-E NAND-flash options
• Non-persistent virtual machines
Storage
iSCSI vs Fibre Channel vs NFS
• iSCSI vs Fibre Channel
• NFS
   • Best option for XenServer
   • Cluster size options in vSphere 5 U1
• Don’t make the choice for “performance” reasons
Storage
Hidden capacity requirements
• vswp file (equal to memory size minus reservation)
• vswp file for memory overhead
• Pagefile
• Identity disk (XenDesktop)
• Differencing disk
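A sketch that adds up the hidden per-VM items above. All sizes are placeholder assumptions for a XenDesktop-style non-persistent layout, not figures from the deck.

```python
# Per-desktop capacity beyond the base image (illustrative numbers only).
vm_ram_gb          = 2.0
mem_reservation_gb = 0.0     # vswp = configured RAM - reservation
overhead_vswp_gb   = 0.15    # swap file for the VM memory overhead (assumed)
pagefile_gb        = 2.0     # in-guest pagefile, often sized near RAM (assumed)
identity_disk_gb   = 0.016   # XenDesktop identity disk, roughly 16 MB
diff_disk_gb       = 5.0     # assumed steady-state differencing disk growth

per_vm_gb = ((vm_ram_gb - mem_reservation_gb) + overhead_vswp_gb
             + pagefile_gb + identity_disk_gb + diff_disk_gb)
print(f"~{per_vm_gb:.1f} GB per desktop on top of the base image")
print(f"~{per_vm_gb * 1000 / 1024:.1f} TB for 1,000 desktops")
```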
Storage – Monitoring
• IOPs - #1 reason for VDI failure
• Latency
Summary
• Do not choose hardware first