2011 q1-indy-vmug


vSpecialist Brian Lewis presents storage best practices to the Indy VMUG on 01-27-2011.

Published in: Technology
Speaker notes (PowerPath slide): Hyper-consolidation of virtual machines can lead to complex storage architectures. Mapping dozens of LUNs accessed by hundreds of VMs to channels is a laborious and complex job, and as virtual machines move around in the cluster, the I/O load on each channel can change significantly. EMC's PowerPath significantly reduces the effort required to set up the SAN environment: it lets you treat the connections between the ESX servers and the EMC storage as a pool. With PowerPath you don't have to figure out which LUNs should share which channels; PowerPath uses all available paths to access all devices, and dynamic load-balancing algorithms continuously adjust I/O routing to provide the best overall performance. So when DRS kicks in and moves VMs around in the cluster, PowerPath automatically adjusts how the I/Os transit the SAN, providing predictable performance. Some VMs and applications are more important or have more critical disk I/O workloads; PowerPath can set priorities on the most important LUNs to help ensure that critical applications get the data they need to run smoothly. Coupled with DRS and array-based QoS (NQM or Priority Manager), you have end-to-end QoS control of your storage environment. PowerPath also provides channel fault protection, so the loss of an HBA, cable, switch, or array connection is invisible to ESX and the application.
Animation control: the slide comes up with one row of VMs. [click] causes I/Os to begin going to/from the storage. Story: with a few VMs, mapping I/O to drives and channels is fairly simple, but as you add more VMs, more I/O load is put on the SAN. [click] causes more VMs to appear with more I/O. With hyper-consolidation you can have hundreds of independent VMs running within the environment; the VMware admin can even add that I/O-intensive app (look in the second row, fourth from the right) that disrupts I/O from other apps in the environment. Setting this up to ensure that all of the VMs get the I/O response time they need is very difficult; then add vMotion, DRS, and HA, and any assumption you have about which I/O streams will share which channels is invalidated. [click] PowerPath is installed on the ESX servers, then all the paths are masked. PowerPath manages all of this complexity, constantly adjusting path usage to the changing I/O loads coming from the VMs. PowerPath lets you ignore the complexity of what goes where: simply assign all devices to all paths and turn PowerPath loose to do its thing, optimizing overall I/O performance for the ESX environment. If need be, you can provide additional QoS management for the most important applications by managing LUN and path prioritization.
1. Indianapolis VMUG – Next Generation Best Practices for Storage and VMware
   Brian Lewis, vSpecialist – Central US, [email_address]
2. The "Great" Protocol Debate
   - Every protocol can be highly available, and generally, every protocol can meet a broad performance band
   - Each protocol has different configuration considerations
   - Each protocol has a VMware "super-power" and also a "kryptonite"
   - In vSphere, there is core feature equality across protocols
   Conclusion: there is no debate – pick what works for you! The best flexibility comes from a combination of VMFS and NFS
3. Key things to know – "A – F" Best Practices circa 2010/2011
4. "A" Best Practices circa 2010/2011 – Leverage Key Documentation
5. Key Papers
   Key VMware docs (highly recommended reading):
   - Fibre Channel SAN Configuration Guide
   - iSCSI SAN Configuration Guide
   - Storage/SAN Compatibility Guide
   …and understand the VMware storage taxonomy:
   - Active/Active (LUN ownership)
   - Active/Passive (LUN ownership)
   - Virtual Port (iSCSI only)
6. Use your storage partner's docs
   - Each array is very different; arrays vary more from vendor to vendor than servers do
   - Find, read, and stay current on your array's best-practices doc – most are excellent
   - Even if you're NOT the storage team, read them – it will help you
   TechBooks (highly recommended reading):
   - http://www.emc.com/collateral/hardware/solution-overview/h2529-vmware-esx-svr-w-symmetrix-wp-ldv.pdf
   - http://www.emc.com/collateral/hardware/technical-documentation/h5536-vmware-esx-srvr-using-celerra-stor-sys-wp.pdf
   - http://www.emc.com/collateral/software/solution-overview/h2197-vmware-esx-clariion-stor-syst-ldv.pdf
7. "B" Best Practices circa 2010/2011 – Configure Multipathing
8. Understanding the vSphere Pluggable Storage Architecture (PSA)
9. What's "out of the box" in vSphere 4.1?
   [root@esxi ~]# vmware -v
   VMware ESX 4.1.0 build-260247
   [root@esxi ~]# esxcli nmp satp list
   Name                 Default PSP       Description
   VMW_SATP_SYMM        VMW_PSP_FIXED     Placeholder (plugin not loaded)
   VMW_SATP_SVC         VMW_PSP_FIXED     Placeholder (plugin not loaded)
   VMW_SATP_MSA         VMW_PSP_MRU       Placeholder (plugin not loaded)
   VMW_SATP_LSI         VMW_PSP_MRU       Placeholder (plugin not loaded)
   VMW_SATP_INV         VMW_PSP_FIXED     Placeholder (plugin not loaded)
   VMW_SATP_EVA         VMW_PSP_FIXED     Placeholder (plugin not loaded)
   VMW_SATP_EQL         VMW_PSP_FIXED     Placeholder (plugin not loaded)
   VMW_SATP_DEFAULT_AP  VMW_PSP_MRU       Placeholder (plugin not loaded)
   VMW_SATP_ALUA_CX     VMW_PSP_FIXED_AP  Placeholder (plugin not loaded)
   VMW_SATP_CX          VMW_PSP_MRU       Supports EMC CX that do not use the ALUA protocol
   VMW_SATP_ALUA        VMW_PSP_RR        Supports non-specific arrays that use the ALUA protocol
   VMW_SATP_DEFAULT_AA  VMW_PSP_FIXED     Supports non-specific active/active arrays
   VMW_SATP_LOCAL       VMW_PSP_FIXED     Supports direct attached devices
10. What's "out of the box" in vSphere? PSPs:
    - Fixed (default for Active-Active LUN ownership models): all I/O goes down the preferred path, and reverts to the preferred path after the original path is restored
    - MRU (default for Active-Passive LUN ownership models): all I/O goes down the active path, and stays there after the original path is restored
    - Round Robin: n I/O operations go down the active path, then rotate (the default n is 1000)
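To see which SATP and PSP actually claimed each device on a given host, the per-device view in the same esxcli namespace can be checked. A minimal sketch, assuming the ESX/ESXi 4.1 command set shown above (output omitted here rather than invented):
    # Hedged example: show, per device, the claiming SATP and the PSP in use...
    esxcli nmp device list
    # ...and list the individual paths behind a specific device.
    esxcli nmp path list --device <device UID>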
11. What's "out of the box" in vSphere? HOW-TO – setting the PSP for a specific device (this can override the default selected by the SATP from the detected array ID):
    esxcli nmp device setpolicy --device <device UID> --psp VMW_PSP_RR
    (check with your vendor first!)
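The policy can also be changed for every device claimed by a given SATP rather than one device at a time. A hedged sketch using the same 4.x namespace – the SATP/PSP pairing below is only an example, so confirm the right combination with your array vendor before changing defaults:
    # Example only: make Round Robin the default PSP for everything claimed by
    # the generic ALUA SATP. Existing devices typically pick up the new default
    # only after they are re-claimed (e.g. after a host reboot).
    esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA --psp VMW_PSP_RR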
12. Or the new way…
13. Changing the Round Robin IOOperationLimit:
    esxcli nmp roundrobin setconfig --device <device UID> --iops <value>
    Check with your storage vendor first! This setting can cause problems on some arrays. It has been validated as OK, but it is not necessary in most cases.
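A commonly cited form of the full command is shown below as a hedged sketch – the IOPS value of 1 was recommended by some vendors and explicitly not recommended by others, so treat it purely as an example and follow your array's best-practices doc:
    # Example only: switch the Round Robin limit type to IOPS and rotate paths
    # after every single I/O instead of the default 1000.
    esxcli nmp roundrobin setconfig --device <device UID> --type iops --iops 1
    # Read the setting back to confirm it took effect.
    esxcli nmp roundrobin getconfig --device <device UID>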
14. Effect of different RR IOOperationLimit settings
    NOTE: this is with a SINGLE LUN – the case where the larger IOOperationLimit default is at its worst.
    In a real-world environment, lots of LUNs and VMs result in decent overall load balancing.
    Recommendation: if you can, stick with the default.
15. What is Asymmetric Logical Unit Access (ALUA)?
    - Many storage arrays have Active/Passive LUN ownership
    - All paths show in the vSphere Client as Active (can be used for I/O)
      - I/O is accepted on all ports
      - All I/O for a LUN is serviced on its owning storage processor
      - In reality some paths are preferred over others
      - Enter ALUA to solve this issue
      - Support introduced in vSphere 4.0
    [Diagram: a LUN owned by SP A, with paths through both SP A and SP B]
16. What is Asymmetric Logical Unit Access (ALUA)?
    - ALUA allows paths to be profiled:
      - Active (can be used for I/O)
      - Active non-optimized (not normally used for I/O)
      - Standby
      - Dead
    - Ensures optimal path selection/usage by the vSphere PSPs and 3rd-party MPPs
      - Supports the Fixed, MRU, & RR PSPs
      - Supports EMC PowerPath/VE
    - ALUA is not supported in ESX 3.5
    [Diagram: a LUN owned by SP A, with paths through both SP A and SP B]
17. Understanding MPIO
    MPIO is based on "initiator-target" sessions – not "links"
18. MPIO Exceptions – Windows Clusters
    Among a long list of "not supported" things:
    - No clustering on NFS datastores
    - No clustering on iSCSI or FCoE (unless using PP/VE)
    - No Round Robin with native multipathing (unless using PP/VE)
    - No mixed environments, such as configurations where one cluster node is running a different version of ESX/ESXi than another cluster node
    - No use of MSCS in conjunction with VMware Fault Tolerance
    - No migration with vMotion of clustered virtual machines
    - No N-Port ID Virtualization (NPIV)
    - You must use hardware version 7 with ESX/ESXi 4.1
19. PowerPath – a Multipathing Plugin (MPP)
    - Simple storage manageability
      - Simple provisioning = "pool of connectivity"
      - Predictable and consistent
      - Optimizes server, storage, and data-path utilization
    - Performance and scale
      - Tune infrastructure performance, LUN/path prioritization
      - Predictive, array-specific load-balancing algorithms
      - Automatic HBA, path, and storage-processor fault recovery
    - Other 3rd-party MPPs:
      - Dell/EqualLogic PSP
        - Uses a "least deep queue" algorithm rather than basic round robin
        - Can redirect I/O to different peer storage nodes
    [Diagram: many VMs across multiple ESX hosts, each host running PowerPath, all connected to shared storage]
20. NFS Considerations
21. General NFS Best Practices
    - Start with vendor best practices: EMC Celerra H5536 & NetApp TR-3749
      - While these are constantly being updated, at any given time they are authoritative
    - Use the EMC & NetApp vCenter plug-ins – they automate best practices
    - Use multiple NFS datastores & 10GbE
      - 1GbE requires more complexity to address I/O scaling, due to one data session per connection with NFSv3
22. General NFS Best Practices – Timeouts
    - Configure the following on each ESX server (automated by the vCenter plugins; a command-line sketch follows this slide):
      - NFS.HeartbeatFrequency = 12
      - NFS.HeartbeatTimeout = 5
      - NFS.HeartbeatMaxFailures = 10
    - Increase guest OS timeout values to match:
      - Back up your Windows registry
      - Select Start > Run, regedit
      - In the left-panel hierarchy view, double-click HKEY_LOCAL_MACHINE > System > CurrentControlSet > Services > Disk
      - Select TimeOutValue and set the data value to 125 (decimal)
      - Note: this is not reset when VMware Tools are updated
    - Increase Net.TcpipHeapSize (follow your vendor's recommendation)
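Where the vCenter plug-in is not available, these advanced settings can also be set from the ESX 4.x service console. A minimal sketch, assuming the esxcfg-advcfg tool; the heap-size number is just a placeholder for whatever your storage vendor recommends:
    # Example only: apply the NFS heartbeat values from the slide above.
    esxcfg-advcfg -s 12 /NFS/HeartbeatFrequency
    esxcfg-advcfg -s 5  /NFS/HeartbeatTimeout
    esxcfg-advcfg -s 10 /NFS/HeartbeatMaxFailures
    # Placeholder value – use the number from your vendor's best-practices doc.
    esxcfg-advcfg -s 30 /Net/TcpipHeapSize
    # Read a value back to confirm.
    esxcfg-advcfg -g /NFS/HeartbeatFrequency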
23. General NFS Best Practices – Traditional Ethernet Switches
    - Mostly seen with older 1GbE switching platforms
      - Each switch operates independently
    - More complex network design
      - Depends on routing; requires two (or more) IP subnets for datastore traffic
      - Multiple Ethernet options based on EtherChannel capabilities and preferences
      - Some links may be passive standby links
24. General NFS Best Practices – Multi-Switch Link Aggregation
    - Allows two physical switches to operate as a single logical fabric
    - Much simpler network design
      - Single IP subnet
      - Provides multiple active connections to each storage controller
      - Easily scales to more connections by adding NICs and aliases
      - Storage controller connection load balancing is automatically managed by the EtherChannel IP load-balancing policy
25. General NFS Best Practices – HA and Scaling (decision tree)
    - 10GbE? Yes: use one VMkernel port & IP subnet.
    - No 10GbE – do the switches support multi-switch link aggregation?
      - Yes: use multiple links with IP-hash load balancing on the NFS client (ESX) and on the NFS server (array); the storage needs multiple sequential IP addresses.
      - No: use multiple VMkernel ports & IP subnets and rely on the ESX routing table; the storage needs multiple sequential IP addresses.
26. iSCSI & NFS – Ethernet Jumbo Frames
    - What is an Ethernet jumbo frame?
      - Ethernet frames with more than 1500 bytes of payload (9000 is common)
      - Commonly 'thought of' as having better performance due to greater payload per packet / fewer packets
    - Should I use jumbo frames?
      - Supported by all major storage vendors & VMware
      - Adds complexity, and performance gains are marginal with common block sizes
      - FCoE uses an MTU of 2240, which is auto-configured via the switch and CNA handshake
        - All IP traffic transfers at the default MTU size
    - Stick with the defaults when you can (a configuration sketch follows this slide)
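If you do decide to run jumbo frames for iSCSI or NFS, every hop has to agree on the MTU. A hedged sketch of the ESX 4.x host side only – the vSwitch name, port-group name, and IP addressing are made-up placeholders, and the physical switch ports and array interfaces must be configured to the same MTU separately:
    # Example only: raise the MTU on the vSwitch carrying storage traffic...
    esxcfg-vswitch -m 9000 vSwitch1
    # ...then create a jumbo-enabled VMkernel NIC on the storage port group.
    esxcfg-vmknic -a -i 192.168.50.11 -n 255.255.255.0 -m 9000 "NFS-PG"
    # Confirm the MTU on vSwitches and VMkernel NICs.
    esxcfg-vswitch -l
    esxcfg-vmknic -l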
27. iSCSI & NFS caveat when used together
    Remember – the iSCSI and NFS network HA models are DIFFERENT:
    - iSCSI uses vmknics with no Ethernet failover – it uses MPIO instead
    - The NFS client relies on vmknics using link aggregation/Ethernet failover
    - NFS relies on the host routing table
    - NFS traffic can end up using the iSCSI vmknic, which results in links without redundancy
    - Use of multiple-session iSCSI together with NFS is not supported by NetApp
    - EMC supports it, but the best practice is separate subnets and virtual interfaces
28. Summary of "Set Up Multipathing Right"
    - VMFS/RDMs
      - The Round Robin policy for NMP is the default best practice on most storage platforms
      - PowerPath/VE further simplifies/automates multipathing on all EMC (and many non-EMC) platforms
        - Notably supports MSCS/WSFC, including vMotion and VM HA
    - NFS
      - For load balancing, distribute VMs across multiple datastores on multiple I/O paths. Follow the resiliency procedure in the TechBook to ensure VM resiliency to storage failover and reboot over NFS
29. "C" Best Practices circa 2010/2011 – Track Alignment
30. "Alignment = good hygiene"
    - Misalignment of filesystems results in additional work on the storage controller to satisfy I/O requests
    - Affects every protocol and every storage array:
      - VMFS on iSCSI, FC, & FCoE LUNs
      - NFS
      - VMDKs & RDMs with NTFS, EXT3, etc.
    - Filesystems exist in both the datastore and the VMDK
    [Diagram: datastore alignment – VMFS blocks (1MB-8MB) aligned to array chunks (4KB-64KB)]
31.-33. "Alignment = good hygiene" (animation builds of the same slide; the diagram extends to guest alignment – guest filesystem clusters (4KB-1MB) on top of VMFS blocks (1MB-8MB) on top of array chunks (4KB-64KB))
34. Alignment – Best Solution: "Align VMs"
    - VMware, Microsoft, Citrix, and EMC all agree: align partitions
      - Plug-and-play guest operating systems: Windows 2008, Vista, & Win7
        - They just work, as their partitions start at 1MB
      - Guest operating systems requiring manual alignment:
        - Windows NT, 2000, 2003, & XP – use diskpart to set a 1MB offset (a sketch follows this slide)
        - Linux – use fdisk expert mode and align on sector 2048 (= 1MB)
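For the older Windows guests, alignment is set when the data partition is first created. A minimal sketch run inside diskpart on the guest, assuming a blank second disk – the disk number and drive letter are placeholders:
    rem Example only: align=1024 is specified in KB, so the partition starts on a 1MB boundary.
    select disk 1
    create partition primary align=1024
    assign letter=E
    rem Format the new partition with NTFS afterwards (Disk Management or format.exe).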
35. Alignment – "Fixing after the fact"
    - VMFS is misaligned
      - Occurs if you created the VMFS via the CLI rather than the vSphere Client and didn't specify an offset
      - Resolution:
        - Step 1: take an array snapshot/backup
        - Step 2: create a new datastore & migrate the VMs using Storage VMotion
    - The filesystem in the VMDK is misaligned
      - Occurs if you are using older OSes and didn't align when you created the guest filesystem
      - Resolution:
        - Step 1: take an array snapshot/backup
        - Step 2: use tools to realign (all affected VMs must be shut down):
          - GParted (free, but some assembly required)
          - Quest vOptimizer (good mass scheduling and reporting)
36. "D" Best Practices circa 2010/2011 – Utilize free vCenter plugins and VAAI
37. "Leverage Free Plugins and VAAI"
    - Use vendor plug-ins for VMware vSphere
      - All provide better visibility
      - Some provide integrated provisioning
      - Some integrate array features like VM snapshots, dedupe, compression, and more
      - Some automate multipathing setup
      - Some automate best practices and remediation
      - Most are FREE
38. VAAI – vStorage APIs for Array Integration
    - Block Zero
      - What: 10x less I/O for common tasks
      - How: eliminate redundant and repetitive write commands – just tell the array to repeat via SCSI commands
    - Full Copy
      - What: 10x faster VM deployment, clone, snapshot, and Storage VMotion
      - How: leverage the array's ability to mass copy, snapshot, and move blocks via SCSI commands
    - Hardware Assisted Locking
      - What: 10x more VMs per datastore
      - How: stop locking whole LUNs and start locking only blocks
39. "What? VAAI isn't working…"
    - How do I know?
      - Test Storage VMotion/cloning with offload versus no-offload
    - What do I do? (host-side check commands follow this slide)
      - Ensure the array is running a VAAI-compliant code level
      - Ensure the storage initiators for the ESX host are configured with ALUA on – look at I/O bandwidth in the vSphere Client and on the storage array
      - The benefit tends to be higher when you Storage VMotion across SPs
      - The biggest benefit isn't any single operation being faster, but rather that overall system (vSphere, network, storage) load is lightened
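One quick host-side check (hedged – this only confirms the VAAI primitives are enabled on the host, not that the array actually supports them) is to read the three hardware-acceleration advanced settings on ESX/ESXi 4.1; a value of 1 means the primitive is turned on:
    # Example only: query the host-side VAAI switches.
    esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove      # Full Copy
    esxcfg-advcfg -g /DataMover/HardwareAcceleratedInit      # Block Zero
    esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking       # Hardware Assisted Locking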
40. "E" Best Practices circa 2010/2011 – Keep it Simple
41. "Keep it Simple on Layout"
    - Use VMFS and NFS together – there is no reason not to
    - Strongly consider 10GbE, particularly for new deployments
    - Avoid RDMs; use "pools" (VMFS or NFS)
    - Make the datastores big
      - VMFS – make them ~1.9TB in size (2TB minus 512 bytes is the max for a single volume; 64TB for a single filesystem)
      - NFS – make them whatever size you want (16TB is the max)
    - With vSphere 4.0 and later, you can have ___ VMs per VMFS datastore
    - On the array, default to storage pools, not traditional RAID groups / hypers / metas
    - Default to single-extent VMFS datastores
    - Default to thin-provisioning models at the array level, optionally at the VMware level
      - Make sure you enable vCenter managed-datastore alerts
      - Make sure you enable Unisphere/SMC thin-provisioning alerts and auto-expansion
    - Use "broad" data services – i.e. FAST, FAST Cache (things that are "set in one place")
42. "F" Best Practices circa 2010/2011 – Use SIOC (if you can)
43. "Use SIOC if you can"
    - "If you can" equals:
      - vSphere 4.1, Enterprise Plus
      - VMFS (NFS is targeted for future vSphere releases – not purely a qualification)
    - Enable it (it is not on by default), even if you don't use shares – it will ensure no VM swamps the others
    - Bonus: you get guest-level latency alerting!
    - The default threshold is 30ms
      - Leave it at 30ms for 10K/15K drives, increase to 50ms for 7.2K, decrease to 10ms for SSD
      - Fully supported with array auto-tiering – leave it at 30ms for FAST pools
    - Hard I/O limits are handy for View use cases
    - Some good recommended reading:
      - http://www.vmware.com/files/pdf/techpaper/VMW-vSphere41-SIOC.pdf
      - http://virtualgeek.typepad.com/virtual_geek/2010/07/vsphere-41-sioc-and-array-auto-tiering.html
      - http://virtualgeek.typepad.com/virtual_geek/2010/08/drs-for-storage.html
      - http://www.yellow-bricks.com/2010/09/29/storage-io-fairness/
44. Best Practices circa 2010/2011 – General 'Gotchas'
45. "My storage team gives me tiny devices"
    - How do I know?
      - "My storage team can only give us 240GB"
    - What do I do?
      - This means you have an "oldey timey" storage team
      - Symmetrix uses hyper devices, and hypers are assembled into meta devices (which are then presented to hosts)
      - Hyper devices have a maximum size of 240GB
      - Configuring meta devices is EASY
46. "My NFS-based VM is impacted following a storage reboot or failover"
    - How do I know?
      - The VM freezes or, even worse, crashes
    - What do I do?
      - Check your ESX NFS timeout settings against your storage partner's recommendations (EMC – see the TechBook)
      - Review your VM and guest OS settings for resiliency; see the TechBook for the detailed procedure on VM resiliency
47. Best Practices circa 2010/2011 – When do the best practices not apply?
48. 5 Exceptions to the rules
    - Create "planned datastore designs" (rather than big pools corrected after the fact) for larger I/O use cases (View, SAP, Oracle, Exchange)
      - Use the VMware + array vendor reference architectures – bake the cake
      - Over time, SIOC may prove to be a good approach
      - There are some relatively rare cases where large spanned VMFS datastores make sense
    - When NOT to use "datastore pools" but pRDMs instead (narrow use cases!)
      - MSCS/WSFC
      - Oracle – pRDMs and NFS can do rapid V-to-P with array snapshots
    - When NOT to use NMP Round Robin
      - Arrays that are not active/active AND use ALUA with only SCSI-2
    - When NOT to use array thin-provisioned devices
      - Datastores with an extremely high amount of small-block random I/O
      - In FLARE 30, always use storage pools; LUN-migrate to thick devices if needed
    - When NOT to use the vCenter plugins? Trick question – the answer is always "use them"
49. THANK YOU – AND COME & PLAY!
    - Win a 320GB eGo Drive at the booth!
    - Hands-on labs in Room 101 H:
      - Lab 1: EMC vCenter Plugin Tour
      - Lab 2: Virtual Storage Integrator
      - Lab 3: vStorage APIs (VAAI) with CLARiiON
      - Lab 4: VPLEX GUI Tour
      - Lab 5: UIM v2 Tour
      - Lab 6: Unisphere GUI Tour