VMware vSphere 4.1 deep dive - part 1



This is a level 200 - 300 presentation.
It assumes:
Good understanding of vCenter 4, ESX 4, ESXi 4.
Preferably hands-on
We will only cover the delta between 4.1 and 4.0
Overview understanding of related products like VUM, Data Recovery, SRM, View, Nexus, Chargeback, CapacityIQ, vShieldZones, etc
Good understanding of related storage, server, network technology
Target audience
VMware Specialist: SE + Delivery from partners

Published in: Technology
  • Isn’t cluster supported in 4.0.1? (I compared the two manuals closely.)
Design here can mean a better design, being able to fix or propose things that you could not before, or having more options to take on a larger or more complex design. Cost here can mean lower product cost, lower services cost (e.g. reduced effort from the partner), or less effort (if internal IT is doing the work). Scalability means you can do more, such as running more VMs per ESX host. Performance means you can do the same thing but faster; for example, backing up a VM is faster.
Memory compression reduces cost: more VMs per ESX host means fewer ESX hosts, or a smaller RAM expense. Scripted install improves security as it reduces the risk of variance among installations. ESXi SAN boot improves security as ESXi configuration is not stored in a hundred places.
vSphere 4.1 introduces an FT-specific versioning-control mechanism that allows the Primary and Secondary VMs to run on FT-compatible hosts at different but compatible patch levels. vSphere 4.1 differentiates between events that are logged for a Primary VM and those that are logged for its Secondary VM, and reports why a host might not support FT. In addition, you can disable VMware HA when FT-enabled VMs are deployed in a cluster, allowing for cluster maintenance operations without turning off FT.
Compare with 4.0: the VMware HA dashboard in the vSphere Client provides a new detailed window called Cluster Operational Status. This window displays more information about the current VMware HA operational status, including the specific status and errors for each host in the VMware HA cluster.
  • Hyper-V import: without it, the migration is more complex and may require longer downtime.
ESX 4.1 takes advantage of deep sleep states to further reduce power consumption during idle periods. The vSphere Client has a simple user interface that allows you to choose one of four host power management policies. In addition, you can view the history of host power consumption and power cap information on the vSphere Client Performance tab on newer platforms with integrated power meters. Need screenshot and new machine.
Faster vMotion improves management as you spend less time waiting for 10 VMs to complete vMotion as you prepare to do hardware maintenance. In some cases, you are given a fixed window to do your maintenance, and you want the 5 or 15 VMs on that host to vMotion as fast as possible.
vSphere 4.1 reduces the amount of overhead memory required, especially when running large VMs on systems with CPUs that provide hardware MMU support (AMD RVI or Intel EPT).
vSphere 4.1 includes an AMD Opteron Gen. 3 (no 3DNow!™) EVC mode that prepares clusters for vMotion compatibility with future AMD processors. EVC also provides numerous usability improvements, including the display of EVC modes for VMs, more timely error detection, better error messages, and the reduced need to restart VMs.
VMware Tools now has a CLI, which
  • VMware Data Recovery is actually available in 4.0.1 too, as it’s compatible.
VMFS enhancements: minor, and transparent to users. There have been many algorithm changes between v3.33 and v3.46. The VMFS-3.46 driver uses hardware-accelerated locking and hardware-accelerated Storage VMotion, VM provisioning, and cold migrate functions on hardware that supports them. This improves the performance and scalability of workloads that require the above functions.
Personally, there are those who are not 100% convinced of the benefit of iSCSI boot. This is because it mixes storage and network, and can make troubleshooting/support complex.
VADP: VSS on Win08.
NFS performance improvement: quantified? NFS performance enhancements: networking performance for NFS has been optimized to improve throughput and reduce CPU usage.
  • Nexus is not released yet. vDS: scalability. vNIC enhancements: the E1000 vNIC supports jumbo frames.
  • You can use Host Profiles to roll out administrator password changes in vSphere 4.1. Enhancements also include improved Cisco Nexus 1000V support and PCI device ordering configuration.
Unattended authentication in vSphere Management Assistant (vMA): vMA 4.1 offers improved authentication capability, including integration with AD and commands to configure the connection.
Update Manager 4.1 immediately sends critical notifications about recalled ESX and related patches. In addition, Update Manager prevents you from installing a recalled patch that you might have already downloaded. This feature also helps you identify hosts where recalled patches might already be installed.
The License Reporting Manager provides a centralized interface for all license keys for vSphere 4.1 products in a virtual IT infrastructure and their respective usage. You can view and generate reports on license keys and usage for different time periods with the License Reporting Manager. A historical record of the utilization per license key is maintained in the vCenter database.
  • An 8-vCPU SMP VM is considered wide on an Intel Xeon 55xx system because the processor has only four cores per NUMA node.
  • ESXi was released around 2 years ago. Just sharing my experience as an SE: in this short period of 2 years, the discussions that I have with customers or partners have progressed from “what is ESXi” to “why should we use ESXi” to “we are using or planning to use ESXi”. For platform software, which needs to build its ecosystem, it is doing very well.
  • We can say that vSphere 4.1 is the release for ESXi. In this release ESXi takes center stage. 4.1 is our strongest message that we are going toward ESXi as the sole hypervisor. A lot of customers, even some of the largest deployments, have decided to go with ESXi going forward. If your customers have not, 4.1 is a good opportunity for you to offer migration services or a hardware refresh.
As SEs, we also know that there are some features we wish we had in the 4.0 release. For example, while the remote CLI helps, none of the Linux commands work, as the execution context is the vMA OS, not the ESXi kernel. And in some troubleshooting scenarios, customers do need to issue Linux commands. Another thing we could not do is automatic installation and boot from network.
  • One of the most popular requests among customers is to improve the deployment and management of ESXi. First in line: boot from SAN is now fully supported in ESXi 4.1. It was only experimentally supported in ESXi 4.0. Boot from SAN will be supported for FC, iSCSI, and FCoE. For iSCSI and FCoE, support depends on hardware qualification, so please check the HCL and Release Notes when vSphere 4.1 is released.
Dependent hardware iSCSI means the card depends on VMware networking, and on the iSCSI configuration and management interfaces provided by VMware. So properties like IP, MAC, and other parameters used for the iSCSI sessions are configured from the VMware GUI/CLI. See http://www.vmware.com/resources/compatibility/info.php?deviceCategory=san&mode=san_introduction
For the ESXi text installer, we have a screen that warns if the user is trying to install the image onto an existing datastore. It will not prevent the user from installing if he/she desires to do so. For scripted install, unless the user specifies an override VMFS flag, the scripted install will not proceed when the user tries to install on an existing datastore. We will only support booting a host from a unique LUN. This LUN *cannot be* shared by other hosts. The user is expected to set proper LUN masking to avoid this scenario. If the LUNs were shared, it could result in data corruption.
----------- copied from 3rd party site
iSCSI SW boot: the only currently supported network card is the Broadcom 57711 10GbE NIC. When booting from software iSCSI, the boot firmware on the network adapter logs into an iSCSI target. The firmware then saves the network and iSCSI boot parameters in the iBFT, which is stored in the host’s memory. Before you can use iBFT, you need to configure the boot order in your server’s BIOS so the iBFT NIC comes before all other devices. You then need to configure the iSCSI configuration and CHAP authentication in the BIOS of the NIC before you can use it to boot ESXi.
The ESXi installation media has special iSCSI initialization scripts that use iBFT to connect to the iSCSI target and present it to the BIOS. Once you select the iSCSI target as your boot device, the installer copies the boot image to it. Once the media is removed and the host rebooted, the iSCSI target is used to boot, and the initialization script runs in first-boot mode, which configures the networking; the configuration is persistent afterwards.
  • The second feature we have implemented is more choice during install. We can now do PXE boot, and we can script it too.
Scripted Installation, the equivalent of Kickstart, is now available. The installer can boot over the network, and at that point you can either do an interactive installation or set it up to do a scripted installation. Both the installed image and the config file (called “ks.cfg”) can be obtained over the network using a variety of protocols. There is also an ability to specify preinstall, postinstall, and first-boot scripts. For example, the postinstall script can configure all the host settings, and the first-boot script could join the host to vCenter. These three types of scripts run either in the context of the Tech Support Mode shell or in Python. The Tech Support Mode shell is a highly stripped-down version of bash.
You can start the scripted installation from a CD-ROM drive or over the network by using PXE booting. You cannot use scripted installation to install ESXi to a USB device.
  • The media depot is a network-accessible location that contains the ESXi installation media. You can use HTTP/HTTPS, FTP, or NFS to access the depot. The depot must be populated with the entire contents of the ESXi installation DVD, preserving directory structure.
If you are performing a scripted installation, you must point to the media depot in the script by including the install command with the nfs or url option.
The following code snippet from an ESXi installation script demonstrates how to format the pointer to the media depot if you are using NFS:
    install nfs --server=example.com --dir=/nfs3/VMware/ESXi/41
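If many hosts are installed from the same depot, the NFS install line can be generated rather than typed by hand. The following is a minimal Python sketch, assuming only the `install nfs --server=... --dir=...` form quoted above; the helper name `nfs_install_line` is hypothetical and is not part of any VMware tool.

```python
def nfs_install_line(server: str, directory: str) -> str:
    """Format the ks.cfg 'install nfs' line that points at the media depot.

    Only the NFS form quoted in the notes is produced; the helper name is
    hypothetical.
    """
    if not server:
        raise ValueError("an NFS server name is required")
    if not directory.startswith("/"):
        raise ValueError("the depot directory must be an absolute path")
    return f"install nfs --server={server} --dir={directory}"


if __name__ == "__main__":
    print(nfs_install_line("example.com", "/nfs3/VMware/ESXi/41"))
```

The validation is deliberately light: it only catches an empty server name or a relative path before the line is written into a script.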
  • The preboot execution environment (PXE) is an environment to boot computers using a network interface independently of available data storage devices or installed OS. These topics discuss the PXELINUX and gPXE methods of PXE booting the ESXi installer. PXE uses DHCP and Trivial File Transfer Protocol (TFTP) to bootstrap an operating system (OS) over a network.
Network booting with PXE is similar to booting with a DVD, but it requires some network infrastructure and a machine with a PXE-capable network adapter. Once the ESXi installer is booted, it works like a DVD-based installation, except that the location of the ESXi installation media (the contents of the ESXi DVD) must be specified. A host first makes a DHCP request to configure its network adapter and then downloads and executes a kernel and support files. PXE booting the installer provides only the first step to installing ESXi. To complete the installation, you must provide the contents of the ESXi DVD either locally or on a networked server through HTTP/HTTPS, FTP, or NFS.
TFTP is a lightweight version of the FTP service, and is typically used only for network booting systems or loading firmware on network devices such as routers. If you do not use gPXE, you might experience issues while booting the ESXi installer on a heavily loaded network. This is because TFTP is not a robust protocol and is sometimes unreliable for transferring large amounts of data. If you use gPXE, only the gpxelinux.0 binary and configuration file are transferred via TFTP. gPXE enables you to use a Web server for transferring the kernel and ramdisk required to boot the ESXi installer. If you use PXELINUX without gPXE, the pxelinux.0 binary, the configuration file, and the kernel and ramdisk are transferred via TFTP.
Setting up a new DHCP server is not recommended if your network already has one. If multiple DHCP servers respond to DHCP requests, machines can obtain incorrect or conflicting IP addresses, or can fail to receive the proper boot information. Seek the guidance of a network administrator in your organization before setting up a DHCP server.
  • Scripted Installation, the equivalent of Kickstart, will be supported on ESXi 4.1. The installer can boot over the network, and at that point you can also do an interactive installation, or else set it up to do a scripted installation. Both the installed image and the config file (called “ks.cfg”) can be obtained over the network using a variety of protocols. There is also an ability to specify preinstall, postinstall, and first-boot scripts. For example, the postinstall script can configure all the host settings, and the first boot script could join the host to vCenter. These three types of scripts run either in the context of the Tech Support Mode shell (which is a highly stripped down version of bash) or in Python.
  • The firstboot scripts are run as initscripts. All initscripts have a numerical part in their filenames. They are sorted by that numerical part to determine the order in which they are run. So a script with "90.1" would run after a script with "90.0" and before a script with "90.2"
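The ordering rule can be sketched in a few lines of Python. This is an illustration only: the filenames are hypothetical, and treating the numerical part as dot-separated integers is an assumption, since the notes only give the "90.0", "90.1", "90.2" example.

```python
import re


def numeric_key(filename: str) -> tuple:
    """Return the numerical part of a script name as a tuple of integers.

    Assumption: the numerical part is the first run of dot-separated digits.
    Names without any digits sort last.
    """
    match = re.search(r"\d+(?:\.\d+)*", filename)
    if match is None:
        return (float("inf"),)
    return tuple(int(part) for part in match.group().split("."))


# Hypothetical initscript names, listed out of order.
scripts = ["firstboot_90.2", "firstboot_90.0", "firstboot_10.5", "firstboot_90.1"]
ordered = sorted(scripts, key=numeric_key)

if __name__ == "__main__":
    for name in ordered:
        print(name)
```

With this key, "90.1" lands after "90.0" and before "90.2", matching the example in the notes.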
  • Finally, Tech Support Mode is fully supported. We support both local access, when you are in front of the server, and remote access, when you are using SSH.
In ESXi 4.0, Tech Support Mode usage was ambiguous. We stated that you should only use it with guidance from VMware Support, but VMware also issued several KBs telling customers how to use it. Getting into Tech Support Mode was also not very user-friendly.
The warning not to use TSM has been removed from the login screen. However, anytime TSM is enabled (either local or remote), a warning banner will appear in the vSphere Client for that host. This is meant to reinforce the recommendation that TSM only be used for fixing problems, not on a routine basis.
The SysAdminTools URL in the message above will take you to vMA, PowerCLI, CLI, etc.
  • Enabling or disabling it from the console is pretty straightforward. By default, after you enable TSM (local or remote), it is automatically disabled again after 10 minutes. This time is configurable, and the timeout can also be disabled entirely. When TSM times out, running sessions are not terminated, allowing you to continue a debugging session. All commands issued in TSM are logged by hostd and sent to syslog, allowing for an incontrovertible audit trail.
When lockdown mode is enabled, DCUI access is restricted to the root user (so root can still go in), while access to Tech Support Mode is completely disabled for all users. With lockdown mode enabled, access to the host for management or monitoring using CIM is possible only through vCenter. Direct access to the host using the vSphere Client is not permitted.
  • As you know, Tech Support Mode is not for day-to-day use. So anytime it is enabled, we will flag it.
  • We can also enable it via the GUI. You select the ESXi you want to manage, then click on the “Configuration” tab. From here, click on the “Security Profile”. Clicking on the properties brings up this dialog box. From here, we can stop and start the relevant services.
  • Procedure:
    1. Log in to the host from the vSphere Client.
    2. From the Configuration tab, select Advanced Settings.
    3. From the Advanced Settings window, select Annotations.
    4. Enter a security message.
    The message is displayed on the direct console Welcome screen.
  • There is now an ability to totally lock down a host. Lockdown mode in ESXi 4.1 forces all remote access to go through vCenter. So Lockdown mode is only available on ESXi hosts that have been added to vCenter.
  • The only local access is for root to access the DCUI. This could be used, for example, to turn off lockdown mode in case vCenter is down. However, there is an option to disable DCUI in vCenter. In this case, with lockdown mode turned on, there is no possible way to manage the host directly; everything must be done through vCenter. If vCenter is down, the only recourse in this case is to reimage the box.
Of course, lockdown mode can be selectively disabled for a host if there is a need to troubleshoot or fix it via TSM, and then enabled again. BTW,
  • vscsiStats has also been ported and is now available directly in the ESXi console. It is an advanced command, and can be used to identify I/O patterns.
  • Other useful utilities for troubleshooting have been added to TSM
  • You can add multiple USB devices, such as security dongles and mass storage devices, to a VM that resides on an ESX/ESXi host to which the devices are physically attached. Knowledge of device components and their behavior, VM requirements, feature support, and ways to avoid data loss can help make USB device passthrough from an ESX/ESXi host to a VM successful.
When you attach a USB device to a physical host, the device is available only to VMs that reside on that host. Those VMs cannot connect to a device on another host in the datacenter. A USB device is available to only one VM at a time. When you remove a device from a VM, it becomes available to other VMs that reside on the host.
USB Arbitrator: manages connection requests and routes USB device traffic. The arbitrator is installed and enabled by default on ESX/ESXi hosts. It scans the host for USB devices and manages device connection among VMs that reside on the host. It routes device traffic to the correct VM instance for delivery to the guest OS. The arbitrator monitors the USB device and prevents other VMs from using it until you release it from the VM it is connected to. If vCenter polling is delayed, a device that is connected to one VM might appear as though it is available to add to another VM. In such cases, the arbitrator prevents the second VM from accessing the USB device.
USB Controller: the USB hardware chip that provides USB function to the USB ports that it manages. The virtual USB controller is the software virtualization of the USB host controller function in the VM. USB controller hardware and modules that support USB 2.0 and USB 1.1 devices must exist on the host. Only one virtual USB controller is available to each VM. The controller supports multiple USB 2.0 and USB 1.1 devices in the virtual computer. The controller must be present before you can add USB devices to the virtual computer. The USB arbitrator can monitor a maximum of 15 USB controllers. Devices connected to controllers numbered 16 or greater are not available to the VM.
Before you hot add memory, CPU, or PCI devices, you must remove any USB devices. Hot adding these resources disconnects USB devices, which might result in data loss.
Before you suspend a VM, make sure that a data transfer is not in progress. During the suspend/resume process, USB devices behave as if they have been disconnected, then reconnected. Also, if you use vMotion to migrate a VM away from the host that the USB device is attached to, the device won't be reconnected when the VM is resumed.
For compound devices, the virtualization process filters out the USB hub so that it is not visible to the VM. The remaining USB devices in the compound appear to the VM as separate devices. You can add each device to the same VM or to different VMs if they reside on the same host.
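The two arbitration rules described above (a device belongs to at most one VM at a time, and only controllers 1 to 15 are monitored) can be modelled with a small Python sketch. This is a toy model for discussion, not the actual arbitrator: the class, method, and device names are all hypothetical.

```python
MAX_CONTROLLERS = 15  # the arbitrator monitors at most 15 USB controllers


class UsbArbitrator:
    """Toy model of USB passthrough arbitration on a single host."""

    def __init__(self):
        self.owner = {}  # device name -> VM currently holding it

    def visible(self, controller_number: int) -> bool:
        # Devices on controllers numbered 16 or greater are not available.
        return 1 <= controller_number <= MAX_CONTROLLERS

    def connect(self, device: str, controller_number: int, vm: str) -> bool:
        if not self.visible(controller_number):
            return False
        current = self.owner.get(device)
        if current is not None and current != vm:
            return False  # already held by another VM
        self.owner[device] = vm
        return True

    def release(self, device: str, vm: str) -> None:
        # Only the VM that holds the device can release it.
        if self.owner.get(device) == vm:
            del self.owner[device]


if __name__ == "__main__":
    arb = UsbArbitrator()
    print(arb.connect("dongle", 1, "vm-a"))
    print(arb.connect("dongle", 1, "vm-b"))
    arb.release("dongle", "vm-a")
    print(arb.connect("dongle", 1, "vm-b"))
```

The second connect attempt corresponds to the delayed-polling case in the notes: the device may look available in vCenter, but the arbitrator still refuses the second VM.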
  • Another feature that was requested a lot is integration with MS AD. This further simplifies the management of vSphere, as we can now be consistent with vCenter.
AD integration provides authentication for all local services. This means access via the Admin Client, via the console, and via the remote console is all based on AD.
ESX and ESXi should integrate with MS AD for all user authentication. This effectively removes static information from the ESX host and enables the "plug and play" and "stateless appliance" concepts. Customers do not want to manage user accounts on ESX or ESXi because it is additional work compared to what they would do in a physical environment. This lowers the Opex of managing a VI environment and also positions our platform competitively against Hyper-V, which can do this today. Customers don’t want to rely on VC for these functions because of VC availability (HA) concerns.
  • So how do we do it? One way is to select the ESX host that you want to add to AD, and choose the “Configuration” tab. From this page, choose the “authentication service” link. Click on the properties link, and the dialog box shown on the next slide appears.
  • From the dialog box that pops up, select “AD” from the drop-down. Then specify the domain name. Then click “Join Domain”. The next dialog box will pop up to let you enter an ID that is allowed to join the domain. Click on the Join Domain button to join the domain. If there is an error, an error message will be shown. If not, ESXi will join the domain.
  • I guess a question from customers will be how they can do this automatically, if they have a lot of ESXi hosts and not enough sysadmins to manage all these things. We have enhanced our host profiles. Here is the screen where we can configure the same info in the host profiles.
  • The idea of memory compression is very straightforward: if the swapped-out pages can be compressed and stored in a compression cache located in main memory, the next access to the page only causes a page decompression, which can be an order of magnitude faster than the disk access. With memory compression, only a few uncompressible pages need to be swapped out if the compression cache is not full. This means the number of future synchronous swap-in operations will be reduced. Hence, it may improve application performance significantly when the host is under heavy memory pressure. In ESX 4.1, only the swap candidate pages will be compressed. This means ESX will not proactively compress guest pages when host swapping is not necessary. In other words, memory compression does not affect workload performance when host memory is undercommitted.
3.5.1 Reclaiming Memory Through Compression
Figure 8 illustrates how memory compression reclaims host memory compared to host swapping. Assuming ESX needs to reclaim two 4KB physical pages from a VM through host swapping, pages A and B are the selected pages (Figure 8a). With host swapping only, these two pages will be directly swapped to disk and two physical pages are reclaimed (Figure 8b). However, with memory compression, each swap candidate page will be compressed and stored using 2KB of space in a per-VM compression cache. Note that page compression is much faster than the normal page swap-out operation, which involves a disk I/O. Page compression will fail if the compression ratio is less than 50%, and the uncompressible pages will be swapped out. As a result, every successful page compression is accounted for as reclaiming 2KB of physical memory. As illustrated in Figure 8c, pages A and B are compressed and stored as half-pages in the compression cache. Although both pages are removed from VM guest memory, the actual reclaimed memory size is one page.
If any subsequent memory access misses in the VM guest memory, the compression cache will be checked first using the host physical page number. If the page is found in the compression cache, it will be decompressed and pushed back to the guest memory. This page is then removed from the compression cache. Otherwise, the memory request is sent to the host swap device and the VM is blocked.
The per-VM compression cache is accounted for by the VM’s guest memory usage, which means ESX will not allocate additional host physical memory to store the compressed pages. The compression cache is transparent to the guest OS. Its size starts at zero when host memory is undercommitted and grows when VM memory starts to be swapped out. If the compression cache is full, one compressed page must be replaced in order to make room for a new compressed page. An age-based replacement policy is used to choose the target page. The target page will be decompressed and swapped out. ESX will not swap out compressed pages. If the pages belonging to the compression cache need to be swapped out under severe memory pressure, the compression cache size is reduced and the affected compressed pages are decompressed and swapped out.
The maximum compression cache size is important for maintaining good VM performance. If the upper bound is too small, a lot of replaced compressed pages must be decompressed and swapped out. Any following swap-ins of those pages will hurt VM performance. However, since the compression cache is accounted for by the VM’s guest memory usage, a very large compression cache may waste VM memory and unnecessarily create VM memory pressure, especially when most compressed pages would not be touched in the future. In ESX 4.1, the default maximum compression cache size is conservatively set to 10%.
Source: http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_memory_mgmt.pdf
Note that this paper is based on the ESX 4.0 memory management paper. Besides the new content introduced in ESX 4.1, e.g., memory compression, quite a few places have been updated to represent the state of the art of ESX memory management.
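The reclaim accounting described above can be sketched in Python. This is a simplified model under stated assumptions: `zlib` stands in for whatever compressor ESX actually uses, the function names `reclaim` and `fault` are hypothetical, and cache size limits and the age-based replacement policy are left out.

```python
import os
import zlib

PAGE_SIZE = 4096            # 4KB physical page
HALF_PAGE = PAGE_SIZE // 2  # a compressed page occupies a 2KB cache slot


def reclaim(pages):
    """Compress swap-candidate pages, swapping out those that do not fit in 2KB.

    Returns (compression_cache, swapped_pages, bytes_reclaimed).
    """
    cache, swapped, reclaimed = {}, {}, 0
    for ppn, data in pages.items():
        compressed = zlib.compress(data)  # stand-in compressor (assumption)
        if len(compressed) <= HALF_PAGE:
            cache[ppn] = compressed
            reclaimed += HALF_PAGE   # 4KB freed, 2KB used by the cache
        else:
            swapped[ppn] = data      # compression failed: page goes to swap
            reclaimed += PAGE_SIZE
    return cache, swapped, reclaimed


def fault(ppn, cache, swapped):
    """Serve a guest access: check the compression cache first, then swap."""
    if ppn in cache:
        return zlib.decompress(cache.pop(ppn))  # page leaves the cache
    return swapped.pop(ppn)                     # slow path: swap-in from disk


if __name__ == "__main__":
    pages = {"A": bytes(PAGE_SIZE), "B": b"ab" * (PAGE_SIZE // 2)}
    cache, swapped, reclaimed = reclaim(pages)
    print(sorted(cache), sorted(swapped), reclaimed)
```

With the two compressible pages A and B, both land in the cache and the reclaimed total is one page, which is the Figure 8c case. A page of random bytes (for example `os.urandom(PAGE_SIZE)`) would fail the 2KB test and take the swap path instead.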
  • What does the counter _really_ mean as it’s an _average_ of a _rate_?
  • esxtop also has a power view, “p”.
  • (2) The feature of displaying per-VM power consumption is experimental and off by default. It can be turned on with an advanced config option, as the paragraph describes. The per-VM power consumption feature is dependent on the host power consumption feature.
  • HA and DRS have always been popular features among our customers. I have quite a number of customers who found that HA is good enough for their SLA and moved from MS clustering. In 4.1, we have a couple of enhancements to these main features.
  • Give tips on HA. Types of cluster: prod, DMZ, tier 2, IT cluster, non-prod, desktop. Why the minimum number of hosts is 4.
This slide gives a summary of the new enhancements. As customers adopt more and more virtualisation, we are entering the phase where mission-critical workloads are virtualised. With all these enhancements in 4.1, customers may be tempted to create large clusters and put everything there. By large I mean either a large number of nodes, or a lot of VMs in a single cluster. Personally, I still prefer the traditional approach, where a cluster is really the building block. So we have multiple clusters, each with a distinct purpose.
From the list above, something that I think customers will appreciate is the
  • In the past, customers reported that they very occasionally saw DRS "get it wrong", in the sense that DRS would move VMs based purely on performance criteria with scant regard for availability concerns. What this means is that in the past it was possible (if somewhat unlikely) for DRS to place 20 VMs on one ESX host and only 8 VMs on another. While that may have been a good idea from a performance standpoint, it could lead to scenarios where DRS itself created an "eggs in one basket" situation, because DRS did not distribute VMs to prevent one ESX host from carrying a much bigger VM count than another. In this scenario, DRS would have to carry out vMotions to free up resources so that HA can power on a VM.
  • For Application Monitoring, developers would develop application monitoring agents using the Application Monitoring SDK for specific applications running in the VM. There is support added in VMware Tools for an application to report its heartbeat/status. This gets communicated to vCenter as an "AppHeartbeatStatus" (similar to the "GuestHeartbeatStatus"). HA can respond to that by going red, indicating that the application has died. Thus, Application Monitoring works for those applications that use the new VMware Tools capability along with an application monitoring agent to report application status.
To enable Application Monitoring: obtain the SDK from VMware (this is for the ISV, not end customers), then use it to set up customized heartbeats for the applications you want to monitor.
  • Since the hypervisor has full control over the execution of a VM, including delivery of all inputs, the hypervisor is able to capture all the necessary information about non-deterministic operations on the primary VM and to replay these operations correctly on the backup VM.
The tagging scheme doesn’t introduce any significant delay of the replaying VM, since the hypervisor of the recording (primary) VM guarantees that the last log entry of each single instruction emulation or device operation is marked as a go-live point. Since the backup VM cannot be significantly delayed, the primary VM is also not affected by the use of go-live points.
  • Patches can cause host build numbers to vary between ESX and ESXi installations. To ensure that your hosts are FT compatible, do not mix ESX and ESXi hosts in an FT pair.
  • FT with vSphere 4.1 still has some incompatibilities: Thin Provisioning and Linked Clones; hot plug devices and USB Passthrough; IPv6 (as HA does not support it); vSMP; N-Port ID Virtualization (NPIV); Storage VMotion; serial/parallel ports; physical and remote CD/floppy.
  • Business opportunity: migrate customers from clustering (running 2 instances) to FT, where we have higher uptime.
  • #1: If administrators wanted to move an ESX host from one vCenter instance to a new one (for whatever reason), they usually did not put the ESX host into maintenance mode. But adding the host to the new vCenter server without removing it from the previous one caused FT failures. Now the administrator gets a warning, which can be followed or ignored (yes/no), when trying to add an ESX host that is managed by a different vCenter.
#2: DRS will vMotion FT-enabled VMs if needed and will place them according to DRS groups and other rules. Storage vMotion is not supported with FT, though.
#3: If an administrator disabled HA, he was forced to disable FT first. Now he gets a warning and can decide to override it and accept that FT will not work as expected. Following this decision, several operations related to FT will be disabled while HA is off.
  • So how do we do it? We can now create 2 types of group: groups of VMs and groups of ESX hosts. We then map the VM group to the ESX group.
  • Can an ESX host belong to multiple groups?
  • The separate rules can now include more than two VMs. If you select a “Separate rule” and include 5 VMs, you’ll need at least 5 ESX hosts to accommodate this rule, as each of the VMs must run on a separate host.
  • vMotion is not a cluster feature: we can vMotion across clusters. Can we vMotion between 2 clusters with different EVC modes? We can try this in the lab. We should be able to vMotion from 4.0 to 4.1, as we can from 3.5 to 4.1.
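The host-count requirement of a separation rule is a simple pigeonhole check, sketched below in Python. The function names and the VM and host labels are hypothetical; this is not how DRS is implemented, only a way to reason about whether a given placement can satisfy the rule.

```python
def min_hosts_for_separation(vms) -> int:
    """A separation rule needs one host per VM in the rule."""
    return len(set(vms))


def placement_satisfies(placement: dict) -> bool:
    """True if no two VMs in the rule share a host (placement: VM -> host)."""
    hosts = list(placement.values())
    return len(hosts) == len(set(hosts))


if __name__ == "__main__":
    rule_vms = ["vm1", "vm2", "vm3", "vm4", "vm5"]
    print(min_hosts_for_separation(rule_vms))
    print(placement_satisfies({"vm1": "esx1", "vm2": "esx2", "vm3": "esx1"}))
```

With five VMs in the rule and only four hosts in the cluster, no placement can pass the second check.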
  • This sounds quite complicated but is easy to understand. Assume a VM was powered on under an older EVC mode and migrated (without powering off) to a cluster with a newer mode (and newer features). In this case the VM is “part” of the new EVC mode, but does not use the new features; it still uses the old ones. Previously, if you tried to vMotion this VM to an ESX host with the older EVC mode, vCenter complained about them not being compatible, as the ESX host was not compatible with the EVC mode of the cluster the VM was running in. Now it checks which mode the VM itself uses and accepts vMotion to an older mode, as the VM doesn’t care and is still not using the new features.
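The difference between the old and new compatibility checks can be illustrated with a short Python sketch. The mode names and their ordering are hypothetical placeholders (not real EVC mode names), and both functions are simplified models of the behaviour described above, not vCenter's actual logic.

```python
# Hypothetical EVC modes, ordered from oldest to newest feature set.
EVC_LEVELS = ["gen1", "gen2", "gen3"]


def level(mode: str) -> int:
    return EVC_LEVELS.index(mode)


def can_vmotion_before(source_cluster_mode: str, dest_host_mode: str) -> bool:
    """Older check: compare the destination with the source cluster's mode."""
    return level(dest_host_mode) >= level(source_cluster_mode)


def can_vmotion_41(vm_mode: str, dest_host_mode: str) -> bool:
    """4.1 check: compare the destination with the mode the VM itself uses."""
    return level(dest_host_mode) >= level(vm_mode)


if __name__ == "__main__":
    # VM powered on under gen1, later migrated live into a gen3 cluster.
    vm_mode, cluster_mode, dest_host = "gen1", "gen3", "gen1"
    print(can_vmotion_before(cluster_mode, dest_host))
    print(can_vmotion_41(vm_mode, dest_host))
```

The only change between the two functions is which mode is compared against the destination, which is the whole point of the enhancement.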
  • Earlier Add-Host Error detection: Host-specific incompatibilities are now displayed prior to the Add-Host work-flow when adding a host into an EVC cluster.
  • The vSphere VM Administration Guide (page 92) states: “You can verify the CPU settings for the VM on the Resource Allocation tab.” But this menu shows no indication of the multicore configuration. What do I have to look for? Is it already implemented in the vSphere 4.1 RC? When you configure multicore virtual CPUs for a VM, CPU hot add/remove is disabled. For more information about multicore CPUs, see the vSphere Resource Management Guide. You can also search the VMware KNOVA database for articles about multicore CPUs. CPU-Z (http://www.cpuid.com/softwares/cpu-z.html) provides a more detailed view within each guest OS. Need to see if we can use Orchestrator or PowerShell to check this.
  • Need to see the PowerCLI and vSphere API to see if we can do this programmatically
  • Note that “Average Capacity” in the report refers to the average capacity of all license keys for that product. A product (e.g. vSphere Enterprise) can have multiple keys, and each key has a capacity and a usage associated with it. In the screen above, Current capacity is the total capacity of all the keys and Average capacity is the average capacity of the keys. For example, for the product vSphere Enterprise:
    key | capacity | usage
    xxxx-xxxx-xxxx | 1000 | 500
    yyyy-xxxx-xxxx | 2000 | 100
    For this product we would report: Total Capacity 3000, Total Usage 600, Average Usage 300, Average Capacity 1500.
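The license-report arithmetic in the note above can be reproduced with a few lines of Python. This is only a sketch of the calculation as the note describes it; the `LicenseKey` class and `summarize` function are hypothetical names for illustration and are not part of any vCenter API.

```python
from dataclasses import dataclass


@dataclass
class LicenseKey:
    """One license key of a product (hypothetical model, not a vCenter type)."""
    key: str
    capacity: int
    usage: int


def summarize(keys):
    """Return the totals and per-key averages described in the note."""
    total_capacity = sum(k.capacity for k in keys)
    total_usage = sum(k.usage for k in keys)
    count = len(keys)
    return {
        "total_capacity": total_capacity,
        "total_usage": total_usage,
        "average_capacity": total_capacity / count,
        "average_usage": total_usage / count,
    }


# The two vSphere Enterprise keys from the example in the note.
keys = [
    LicenseKey("xxxx-xxxx-xxxx", capacity=1000, usage=500),
    LicenseKey("yyyy-xxxx-xxxx", capacity=2000, usage=100),
]
report = summarize(keys)
print(report)
```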
  • VMware vSphere 4.1 deep dive - part 1

    1. 1. vSphere 4.1: Delta to 4.0<br />Tech Sharing for Partners<br />Iwan ‘e1’ Rahabok, Senior Systems Consultant<br />e1@vmware.com | virtual-red-dot.blogspot.com | tinyurl.com/SGP-User-Group | facebook.com/e1ang<br />August 2010<br />
    2. 2. Audience Assumption<br />This is a level 200 - 300 presentation.<br />It assumes:<br />Good understanding of vCenter 4, ESX 4, ESXi 4. <br />Preferably hands-on<br />We will only cover the delta between 4.1 and 4.0<br />Overview understanding of related products like VUM, Data Recovery, SRM, View, Nexus, Chargeback, CapacityIQ, vShieldZones, etc<br />Good understanding of related storage, server, network technology<br />Target audience<br />VMware Specialist: SE + Delivery from partners<br />
    3. 3. Agenda<br />New features<br />Server<br />Storage<br />Network<br />Management<br />Upgrade<br />
    4. 4. 4.1 New Feature (over 4.0, not 3.5): Server<br />
    5. 5. 4.1 New Feature (over 4.0, not 3.5): Server<br />
    6. 6. 4.1 New Feature (over 4.0, not 3.5): Storage<br />
    7. 7. 4.1 New Feature (over 4.0, not 3.5): Network<br />
    8. 8. 4.1 New Feature: Management<br />
    9. 9. Builds:<br />ESX build 260247<br />VC build 258902<br />Some stats:<br />4000 development weeks were spent to get to FC<br />5100 QA weeks were spent to get to FC<br />872 beta customers downloaded and tried it out<br />2012 servers, 2277 storage arrays, and 2170 IO devices are already on the HCL<br /> <br />
    10. 10. Consulting Services: Kit<br />The vSphere Fundamentals services kit<br />Includes core services enablement materials for vSphere Jumpstarts, Upgrades, Converter/P2V and PoCs.  <br />The update reflects what’s new in vSphere 4.1 - including new resource limits, memory compression, Storage IO Control, vNetwork Traffic Management, and vSphere Active Directory Integration. <br />The kit is intended for use by PSO Consultants, TAMs, and SEs to help with delivering services engagements, PoCs, or knowledge transfer sessions with customers. <br />Located at Partner Central – Services IP Assets<br />https://na6.salesforce.com/sfc/#version?selectedDocumentId=069800000000SSi<br />For delivery partner: <br />Please <br />download this.<br />
    11. 11. 4.1 New Features: Server<br />
    12. 12. PXE Boot Retry<br />Virtual Machine -> Edit Settings -> Options -> Boot Options<br />Failed Boot Recovery disabled by default<br />Enable it and set the VM to automatically retry boot after X seconds<br />
    13. 13. Wide NUMA Support<br />Wide VM<br />Wide-VM is defined as a VM that has more vCPUs than the available cores on a NUMA node. <br />A 5-vCPU VM in a quad-core server<br />Only the cores count, and hyperthreading threads don’t<br />ESX 4.1 scheduler introduces wide-VM NUMA support<br />Improves memory locality for memory-intensive workloads. <br />Based on testing with micro benchmarks, the performance benefit can be up to 11–17%.<br />How it works<br />ESX 4.1 allows wide-VMs to take advantage of NUMA management. NUMA management means that a VM is assigned a home node where memory is allocated and vCPUs are scheduled. By scheduling vCPUs on a NUMA node where memory is allocated, the memory accesses become local, which is faster than remote accesses<br />
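The wide-VM definition on this slide reduces to a single comparison. The sketch below only illustrates that definition; the function name `is_wide_vm` is made up for the example and does not correspond to any ESX interface.

```python
def is_wide_vm(vcpus: int, cores_per_numa_node: int) -> bool:
    """Return True if the VM has more vCPUs than one NUMA node has cores.

    Pass physical cores only: per the slide, hyperthreading threads
    do not count toward the node's capacity.
    """
    if vcpus < 1 or cores_per_numa_node < 1:
        raise ValueError("vcpus and cores_per_numa_node must be positive")
    return vcpus > cores_per_numa_node


# The slide's example: a 5-vCPU VM on a quad-core (per node) server.
print(is_wide_vm(5, 4))
# A 4-vCPU VM still fits inside one quad-core node.
print(is_wide_vm(4, 4))
```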
    14. 14. ESXi<br />Enhancements to ESXi. Not applicable to ESX<br />
    15. 15. Transitioning to ESXi<br />ESXi is our architecture going forward<br />
    16. 16. Moving toward ESXi<br />Permalink to: VMware ESX and ESXi 4.1 Comparison<br />Service Console (COS)<br />Agentless vAPI-based<br />Management Agents<br />Hardware Agents<br />Agentless CIM-based<br />Commands for configuration and diagnostics<br />vCLI, PowerCLI<br />Local Support Console<br />CIM API<br />vSphere API<br />Infrastructure<br />Service Agents<br />Native Agents: NTP, Syslog, SNMP<br />VMware ESXi<br />“Classic” VMware ESX<br />
    17. 17. Software Inventory - Connected to ESXi/ESX<br />From vSphere 4.1<br />Before<br />Enumerate instance of CIM_SoftwareIdentity<br />Enhanced CIM provider now displays great detail on installed software bundles.<br />
    18. 18. Software Inventory – Connected to vCenter<br />Before<br />From vSphere 4.1<br />Enumerate instance of CIM_SoftwareIdentity<br /><ul><li>Enhanced CIM provider now displays great detail on installed software bundles.</li></li></ul><li>Additional Deployment Option<br />Boot From SAN<br />Fully supported in ESXi 4.1<br />Was only experimentally supported in ESXi 4.0<br />Boot from SAN supported for FC, iSCSI, and FCoE<br />ESX and ESXi have different requirements:<br />iBFT (iSCSI Boot Firmware Table) required<br />The host must have an iSCSI boot capable NIC that supports the iSCSI iBFT format. <br />iBFT is a method of communicating parameters about the iSCSI boot device to an OS<br />
    19. 19. Additional Deployment Option<br />Scripted Installation<br />Numerous choices for installation<br />Installer booted from<br />CD-ROM (default)<br />Preboot Execution Environment (PXE)<br />ESXi Installation image on<br />CD-ROM (default), HTTP/S, FTP, NFS<br />Script can be stored and accessed<br />Within the ESXi Installer ramdisk<br />On the installation CD-ROM<br />HTTP / HTTPS, FTP, NFS <br />Config script (“ks.cfg”) can include<br />Preinstall<br />Postinstall<br />First boot<br />Cannot use scripted installation to install to a USB device<br />
    20. 20. PXE Boot<br />Requirements<br />PXE-capable NIC.<br />DHCP Server (IPv4). Use existing one.<br />Media depot + TFTP server + gPXE<br />A server hosting the entire content of ESXi media. <br />Protocol: HTTP/HTTPS, FTP, or NFS server.<br />OS: Windows/Linux server.<br />Info<br />We recommend the method that uses gPXE. If not, you might experience issues while booting the ESXi installer on a heavily loaded network.<br />TFTP is a light-weight version of the FTP service, and is typically used only for network booting systems or loading firmware on network devices such as routers.<br />
    21. 21. PXE boot<br />PXE uses DHCP and Trivial File Transfer Protocol (TFTP) to bootstrap an OS over network.<br />How it works<br />A host makes a DHCP request to configure its NIC. <br />A host downloads and executes a kernel and support files. PXE booting the installer provides only the first step to installing ESXi. <br />To complete the installation, you must provide the contents of the ESXi DVD <br />Once ESXi installer is booted, it works like a DVD-based installation, except that the location of the ESXi installation media must be specified.<br />
    22. 22. Additional Deployment Option<br />
    23. 23. Sample ks.cfg file<br /># Accept the EULA (End User Licence Agreement)<br />vmaccepteula<br /># Set the root password to vmware123<br />rootpw vmware123<br /># Install the ESXi image from CDROM<br />install cdrom<br /># Auto partition the first disk; if a VMFS exists it will overwrite it.<br />autopart --firstdisk --overwritevmfs<br /># Create a partition called Foobar<br /># Partition the disk identified with vmhba1:c0:t1:l0 to grow to a maxsize of 4000<br />partition Foobar --ondisk=mpx.vmhba1:C0:T1:L0 --grow --maxsize=4000<br /># Set up the management network on the vmnic0 using DHCP<br />network --bootproto=dhcp --device=vmnic0 --addvmportgroup=0<br />%firstboot --level=90.1 --unsupported --interpreter=busybox<br /># On this first boot, save the current date to a temporary file<br />date > /tmp/foo<br /># Mount an nfs share and put it at /vmfs/volumes/www<br />esxcfg-nas -add -host -share /var/www www<br />
    24. 24. Full Support of Tech Support Mode<br />There you go <br />2 types<br />Remote: SSH<br />Local: Direct Console<br />
    25. 25. Full Support of Tech Support Mode<br />Enter to toggle. That’s it!<br />Disable/Enable <br />Timeout automatically disables TSM (local and remote)<br />Running sessions are not terminated.<br />All commands issued in Tech Support Mode are sent to syslog<br />
    26. 26. Full Support of Tech Support Mode<br />Recommended uses<br />Support, troubleshooting, and break-fix<br />Scripted deployment preinstall, postinstall, and first boot scripts<br />Discouraged uses<br />Any other scripts<br />Running commands/scripts periodically (cron jobs)<br />Leaving open for routine access or permanent SSH connection<br />Admin will be notified when active<br />
    27. 27. Full Support of Tech Support Mode<br />We can also enable it via GUI<br />Can enable in vCenter or DCUI<br />Enable/Disable<br />
    28. 28. Security Banner<br />A message that is displayed on the direct console Welcome screen.<br />
    29. 29. Total Lockdown<br />
    30. 30. Total Lockdown<br /><ul><li>Ability to totally control local access via vCenter</li></ul>DCUI<br />Lockdown Mode (disallows all access except root on DCUI)<br />Tech Support Mode (local and remote)<br />If all configured, then no local activity possible (except pull the plugs)<br />
    31. 31. Additional commands in Tech Support Mode<br />vscsiStats is now available in the console.<br />Output is raw data for histogram.<br />Use spreadsheet to plot the histogram<br />Some use cases:<br />Identify whether IO are sequential or random<br />Optimizing for IO Sizes<br />Checking for disk mis-alignment<br />Looking at storage latency in more detail<br />
    32. 32. Additional commands in Tech Support Mode<br />Additional commands for troubleshooting<br />nc (netcat)<br />http://en.wikipedia.org/wiki/Netcat<br />tcpdump-uw<br />http://en.wikipedia.org/wiki/Tcpdump<br />
    33. 33. More ESXi Services listed<br />More services are now shown in GUI.<br />Ease of control<br />For example, if SSH is not running, you can turn it on from GUI.<br />ESXi 4.0<br />ESXi 4.1<br />
    34. 34. ESXi Diagnostics and Troubleshooting<br /><ul><li> If things go wrong:
    35. 35. During normal operations:</li></ul>DCUI: misconfigs / restart mgmt agents <br />vCLI<br />vCenter <br />vSphere APIs<br />TSM: Advanced troubleshooting (GSS) <br />ESXi<br />Remote Access<br />Local Access<br />
    36. 36. Common Enhancements for both ESX and ESXi<br />64 bit User World<br />Running VMs with very large memory footprints implies that we need a large address space for the VMX. <br />32-bit user worlds (VMX32) do not have sufficient address space for VMs with large memory. 64-bit User worlds overcome this limitation.<br />NFS<br />The number of NFS volumes supported is increased from 8 to 64.<br />Fibre Channel<br />End-To-End Support for 8 Gb (HBA, Switch & Array).<br />VMFS<br />Version changed to 3.46. No customer visible changes. Changes related to algorithms in the vmfs3 driver to handle new VMware APIs for Array Integration (VAAI).<br />
    37. 37. Common Enhancements for both ESX and ESXi<br />VMkernel TCP/IP Stack Upgrade<br />Upgraded to version based on BSD 7.1. <br />Result: improving FT logging, VMotion and NFS client performance.<br />Pluggable Storage Architecture (PSA)<br />New naming convention.<br />New filter plugins to support VAAI (vStorage APIs for Array Integration).<br />New PSPs (Path Selection Policies) for ALUA arrays.<br />New PSP from DELL for the EqualLogic arrays.<br />
    38. 38. USB pass-through<br />New Features for both ESX/ESXi<br />
    39. 39. USB Devices<br />2 steps:<br />Add USB Controller<br />Add USB Devices<br />
    40. 40. USB Devices<br />Only devices listed in the manual are supported.<br />Mostly for ISV licence dongles.<br />A few external USB drives.<br />Limited list of devices for now<br />
    41. 41. Example 1<br />After vMotion, the VM will be on another (remote) ESXi.<br />Communication inter-ESXi will use Mgmt Network (ESXi has no SC network)<br />You cannot multi-select devices at this stage – add them one by one.<br /><ul><li>Source: http://vstorage.wordpress.com/2010/07/15/usb-passthrough-in-vsphere-4-1/</li></li></ul><li>Example 1<br />From the source<br />“I have tested numerous brands of USB mass storage devices (Kingston, Sandisk, Lexar, Imation) as well a couple of of security dongles and they all work well.”<br />
    42. 42. Example 2: adding UPS<br /><ul><li>Source: http://vninja.net/virtualization/using-usb-pass-through-in-vsphere-4-1/</li></li></ul><li>Example 2<br /><ul><li>Source: http://vninja.net/virtualization/using-usb-pass-through-in-vsphere-4-1/</li></li></ul><li>USB Devices: Supported Devices<br />
    43. 43. USB Devices<br />Up to 20 devices per VM. Up to 20 devices per ESX host.<br />1 device can only be owned by 1 VM at a given time. No sharing.<br />Supported<br />vMotion<br />Communication via the management network<br />DRS<br />Unsupported<br />DPM. DPM is not aware of the device and may turn it off. This may cause loss of data. So disable DRS for this VM so it stays in this host only.<br />Fault Tolerance<br />Design consideration<br />Take note of situation when the ESX host is not available (planned or unplanned downtime)<br />
    44. 44. MS AD integration<br />New Features for both ESX/ESXi<br />
    45. 45. AD Service<br />Provides authentication for all local services<br />vSphere Client<br />Other access based on vSphere API <br />DCUI<br />Tech Support Mode (local and remote)<br />Has nominal AD groups functionality<br />Members of “ESX Admins” AD group have Administrative privilege<br />Administrative privilege includes:<br />Full Administrative role in vSphere Client and vSphere API clients<br />DCUI access<br />Tech Support Mode access (local and remote)<br />
    46. 46. The Likewise Agent<br />ESX uses an agent from Likewise to connect to MS AD and to authenticate users with their domain credentials. <br />The agent integrates with the VMkernel to implement the mapping for applications such as the logon process (/bin/login) which uses a pluggable authentication module (PAM). <br />As such, the agent acts as an LDAP client for authorization (join domain) and as a Kerberos client for authentication (verify users).<br />The vMA appliance also uses an agent from Likewise.<br />ESX and vMA use different versions of the Likewise agent to connect to the Domain Controller. ESX uses version 5.3 whereas vMA uses version 5.1.<br />
    47. 47. Joining AD: Step 1<br />
    48. 48. Joining AD: Step 2<br />1. Select “AD”<br />2. Click “Join Domain”<br />3. Join the domain. Full name.<br />@123.com<br />
    49. 49. AD Service<br />A third method for joining ESX/ESXi hosts and enabling Authentication Services to utilize AD is to configure it through Host Profiles <br />
    50. 50. AD Likewise Daemons on ESX<br /><ul><li>lwiod is the Likewise I/O Manager service - I/O services for communication. Launched from /etc/init.d/lwiod script.
    51. 51. netlogond is the Likewise Site Affinity service - detects optimal AD domain controller, global catalogue and data caches. Launched from /etc/init.d/netlogond script.
    52. 52. lsassd is the Likewise Identity & Authentication service. It does authentication, caching and idmap lookups. This daemon depends on the other two daemons running. Launched from /etc/init.d/lsassd script.</li></ul>root 18015 1 0 Dec08 ? 00:00:00 /sbin/lsassd --start-as-daemon<br />root 31944 1 0 Dec08 ? 00:00:00 /sbin/lwiod --start-as-daemon<br />root 31982 1 0 Dec08 ? 00:00:02 /sbin/netlogond --start-as-daemon<br />
    53. 53. ESX Firewall Requirements for AD<br />Certain ports in SC are automatically opened in the Firewall Configuration to facilitate AD. <br />Not applicable to ESXi<br />Before<br />After<br />
    54. 54. Time Sync Requirement for AD<br />Time must be in sync between the ESX/ESXi server and the AD server. <br />For the Likewise agent to communicate over Kerberos with the domain controller, the clock of the client must be within the domain controller's maximum clock skew, which is 300 seconds, or 5 minutes, by default. <br />The recommendation would be that they share the same NTP server.<br />
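The clock-skew requirement on this slide can be expressed as a small check. The sketch below assumes the default 300-second maximum skew quoted above; the function name and the way the two timestamps are obtained are illustrative assumptions, not part of the Likewise agent.

```python
from datetime import datetime, timedelta

# Default maximum clock skew quoted on the slide (300 seconds, i.e. 5 minutes).
MAX_CLOCK_SKEW = timedelta(seconds=300)


def within_kerberos_skew(host_time: datetime, dc_time: datetime,
                         max_skew: timedelta = MAX_CLOCK_SKEW) -> bool:
    """Return True if the host clock is within max_skew of the domain controller."""
    return abs(host_time - dc_time) <= max_skew


# Example timestamps (made up): host is 4 minutes ahead, then 6 minutes behind.
dc_clock = datetime(2010, 8, 1, 12, 0, 0)
print(within_kerberos_skew(dc_clock + timedelta(minutes=4), dc_clock))
print(within_kerberos_skew(dc_clock - timedelta(minutes=6), dc_clock))
```

Pointing the hosts and the domain controller at the same NTP server, as the slide recommends, keeps this difference well inside the limit.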
    55. 55. vSphere Client<br />Now when assigning permissions to users/groups, the list of users and groups managed by AD can be browsed by selecting the Domain.<br />
    56. 56. Info in AD<br />The host should also be visible on the Domain Controller in the AD Computers objects listing.<br />Looking at the ESX Computer Properties shows a Name of RHEL (as it is the Service Console on the ESX) and a Service pack of ‘Likewise Identity 5.3.0’<br />
    57. 57. Memory Compression<br />New Features for both ESX/ESXi<br />
    58. 58. Memory Compression<br />VMkernel implements a per-VM compression cache to store compressed guest pages. <br />When a guest page (4 KB page) needs to be swapped, VMkernel will first try to compress the page. If the page can be compressed to 2 KB or less, the page will be stored in the per-VM compression cache. <br />Otherwise, the page will be swapped out to disk. If a compressed page is again accessed by the guest, the page will be decompressed online. <br />
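The compress-or-swap decision described on this slide can be sketched as follows. This is an illustration of the 4 KB / 2 KB rule only: it uses Python's `zlib` as a stand-in compressor (the slide does not say which algorithm the VMkernel uses), and the list-based cache and swap file are hypothetical simplifications.

```python
import os
import zlib

PAGE_SIZE = 4096         # guest page size from the slide (4 KB)
COMPRESSED_LIMIT = 2048  # a page must compress to 2 KB or less to be cached


def reclaim_page(page: bytes, compression_cache: list, swap_file: list) -> str:
    """Try to compress a page; fall back to swapping it out.

    zlib is only a stand-in here; the real compressor is an assumption.
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("expected a 4 KB page")
    compressed = zlib.compress(page)
    if len(compressed) <= COMPRESSED_LIMIT:
        compression_cache.append(compressed)  # kept in the per-VM cache
        return "compressed"
    swap_file.append(page)                    # too large: swap out to disk
    return "swapped"


cache, swap = [], []
zero_page = bytes(PAGE_SIZE)          # highly compressible
random_page = os.urandom(PAGE_SIZE)   # effectively incompressible
print(reclaim_page(zero_page, cache, swap))
print(reclaim_page(random_page, cache, swap))
# A later guest access decompresses the cached copy back to the original page.
restored = zlib.decompress(cache[0])
```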
    59. 59. Changing the value of cache size<br />
    60. 60. Virtual Machine Memory Compression<br />Virtual Machine -> Resource Allocation<br />Per-VM statistic showing compressed memory<br />
    61. 61. Monitoring Compression<br />3 new counters introduced to monitor<br />Host level, not VM level. <br />
    62. 62. Power Management<br />
    63. 63. Power consumption chart<br />Per ESX, not per cluster<br />Need hardware integration.<br />Different HW makers provide different info<br />
    64. 64. Performance Graphs – Power Consumption<br />We can now track the Power consumption of VMs in real-time<br />Enabled through Software Settings -> Advanced Settings -> Power -> Power.ChargeVMs<br />
    65. 65. Host power consumption<br />In some situations, you may need to edit /usr/share/sensors/vmware to get support for the host<br />Different HW makers have different API.<br />VM power consumption<br />Experimental. Off by default<br />
    66. 66. ESX<br />Features only for ESX (not ESXi)<br />
    67. 67. ESX: Service Console firewall<br />Changes in ESX 4.1<br />ESX 4.1 introduces these additional configuration files located in /etc/vmware/firewall/chains:<br />usercustom.xml<br />userdefault.xml<br />Relationship between the 2 files<br />“user” overwrites.<br />The default files custom.xml and default.xml are overridden by usercustom.xml and userdefault.xml.<br />All configuration is saved in usercustom.xml and userdefault.xml.<br />Copy the original custom.xml and default.xml files. <br />Use them as a template for usercustom.xml and userdefault.xml.<br />
    68. 68. Cluster<br />HA, FT, DRS & DPM<br />
    69. 69. Availability Feature Summary<br />HA and DRS Cluster Limitations<br />High Availability (HA) Diagnostic and Reliability Improvements<br />FT Enhancements <br />vMotion Enhancements<br />Performance<br />Usability<br />Enhanced Feature Compatibility<br />VM-host Affinity (DRS)<br />DPM Enhancements<br />Data Recovery Enhancements<br />
    70. 70. DRS: more HA-awareness<br />vSphere 4.1 adds logic to prevent imbalance that may not be good from HA point of view.<br />Example<br />20 small VMs and 2 very large VMs.<br />2 ESXi hosts. The 2 very large VMs carry the same workload as the 20 small VMs collectively.<br />vSphere 4.0 may put the 20 small VMs on Host A and the 2 very large VMs on Host B.<br />From HA point of view, this may result in risks when Host A fails.<br />vSphere 4.1 will try to balance the number of VMs.<br />
    71. 71. HA and DRS Cluster Improvements<br />Increased cluster limitations<br /><ul><li>Cluster limits are now unified for HA and DRS clusters
    72. 72. Increased limits for VMs/host and VMs/cluster
    73. 73. Cluster limits for HA and DRS:
    74. 74. 32 hosts/cluster
    75. 75. 320 VMs/host (regardless of # of hosts/cluster)
    76. 76. 3000 VMs/cluster
    77. 77. Note that these limits also apply to post-failover scenarios. Be sure that these limits will not be violated even after the maximum configured number of host failovers.</li></li></ul><li>HA and DRS Cluster Limit<br />5-host cluster, tolerate 1 host failure<br /><ul><li>vSphere 4.1 supports 320 VMs/host
    78. 78. Supports 320x5 VMs/cluster? NO
    79. 79. Cluster can only support 320x4 VMs</li></ul>X<br />5-host cluster, tolerate 2 host failures<br /><ul><li>Supports 320x5 VMs/cluster? NO
    80. 80. Cluster can only support 320x3 VMs</li></ul>X<br />X<br />
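The post-failover arithmetic in the two examples above can be sketched in Python. The constants come from the limits listed on the slides (32 hosts/cluster, 320 VMs/host, 3000 VMs/cluster); the function name `max_supported_vms` is a made-up helper for illustration.

```python
MAX_HOSTS_PER_CLUSTER = 32
MAX_VMS_PER_HOST = 320
MAX_VMS_PER_CLUSTER = 3000


def max_supported_vms(hosts: int, host_failures_tolerated: int) -> int:
    """Largest VM count that still respects the limits after failover.

    Only the hosts that survive the configured number of failures count,
    and the per-cluster limit caps the result.
    """
    if not 0 < hosts <= MAX_HOSTS_PER_CLUSTER:
        raise ValueError("hosts must be between 1 and 32")
    if not 0 <= host_failures_tolerated < hosts:
        raise ValueError("failures tolerated must be less than the host count")
    surviving_hosts = hosts - host_failures_tolerated
    return min(surviving_hosts * MAX_VMS_PER_HOST, MAX_VMS_PER_CLUSTER)


print(max_supported_vms(5, 1))   # 320 x 4
print(max_supported_vms(5, 2))   # 320 x 3
print(max_supported_vms(32, 1))  # capped by the per-cluster limit
```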
    81. 81. HA Diagnostic and Reliability Improvements<br />HA Healthcheck Status<br /><ul><li>HA provides an ongoing healthcheck facility to ensure that the required cluster configuration is met at all times. Deviations result in an event or alarm on the cluster.</li></ul>Improved HA-DRS interoperability during HA failover<br /><ul><li>DRS will perform vMotion to free up contiguous resources (i.e. on one host) so that HA can place a VM that needs to be restarted</li></li></ul><li>HA Diagnostic and Reliability Improvements<br />HA Operational Status<br />Displays more information about the current HA operational status, including the specific status and errors for each host in the HA cluster.<br />It shows if the host is Primary or Secondary!<br />
    82. 82. HA Operational Status<br />Just another example <br />
    83. 83. HA: Application Awareness<br />Application Monitoring can restart a VM if the heartbeats for an application it is running are not received<br />Exposes APIs for 3rd party app developers<br />Application Monitoring works much the same way as VM Monitoring: <br />If the heartbeats for an application are not received for a specified time via VMware Tools, its VM is restarted.<br />ESXi 4.0<br />ESXi 4.1<br />
    84. 84. Fault Tolerance<br />
    85. 85. FT Enhancements<br />DRS<br />FT fully integrated with DRS<br /><ul><li>DRS load balances FT Primary and Secondary VMs. EVC required.</li></ul>Versioning control lifts requirement on ESX build consistency<br /><ul><li>Primary VM can run on a host with a different build # than the Secondary VM.</li></ul>Events for Primary VM vs. Secondary VM differentiated<br /><ul><li>Events logged/stored differently.</li></ul>FT Primary VM<br />FT Secondary VM<br />Resource Pool<br />
    86. 86. No data-loss Guarantee<br />vLockStep: 1 CPU step behind<br />Primary/backup approach<br />A common approach to implementing fault-tolerant servers is the primary/backup approach. The execution of a primary server is replicated by a backup server. Given that the primary and backup servers execute identically, the backup server can take over serving client requests without any interruption or loss of state if the primary server fails<br />
    87. 87. New versioning feature<br />FT now has a version number to determine compatibility <br />Restriction to have identical ESX build # has been lifted<br />Now FT checks its own version number to determine compatibility<br />Future versions might be compatible with older ones, but possibly not vice-versa<br />Additional information on vSphere Client<br />FT version displayed in host summary tab<br /># of FT enabled VMs displayed there<br />For hosts prior to ESX/ESXi 4.1, this tab lists the host build number instead.<br />FT versions included in vm-support output<br />/etc/vmware/ft-vmk-version:<br />product-version = 4.1.0<br />build = 235786<br />ft-version = 2.0.0<br />
    88. 88. FT logging improvements<br />FT traffic was bottlenecked to 2 Gbit/s even on 10 Gbit/s pNICs<br />Improved by implementing ZeroCopy feature for FT traffic Tx, too<br />For sending only (Tx)<br />Instead of copying from FT buffer into pNIC/socket buffer just a link to the memory holding the data is transferred<br />Driver accesses data directly- no copy needed<br />
    89. 89. FT: unsupported vSphere features<br />Snapshots. <br />Snapshots must be removed or committed before FT can be enabled on a VM. It is not possible to take snapshots of VMs on which FT is enabled.<br />Storage vMotion. <br />Cannot invoke Storage vMotion for FT VM. To migrate the storage, temporarily turn off FT, do Storage vMotion, then turn on FT. <br />Linked clones. <br />Cannot enable FT on a VM that is a linked clone, nor can you create a linked clone from an FT-enabled VM.<br />Back up. <br />Cannot back up an FT VM using VCB, vStorage API for Data Protection, VMware Data Recovery or similar backup products that require the use of a VM snapshot, as performed by ESXi. To back up VM in this manner, first disable FT, then re-enable FT after backup is done. <br />Storage array-based snapshots do not affect FT.<br />Thin Provisioning, NPIV, IPv6, etc<br />
    90. 90. FT: performance sample <br />MS Exchange 2007<br />1 core handles 2000 Heavy Online user profile<br />VM CPU utilisation is only 45%. ESX is only 8%<br />Based on previous “generation”<br />Xeon 5500, not 5600<br />vSphere 4.0, not 4.1<br />Opportunity<br />Higher uptime for customer email system<br />
    91. 91. Integration with HA<br />Improved FT host management<br />Move host out of vCenter<br />DRS able to vMotion FT VMs<br />Warning if HA gets disabled and following operations will be disabled<br />Turn on FT<br />Enable FT<br />Power on a FT VM <br />Test failover <br />Test secondary restart<br />
    92. 92. VM-to-Host Affinity<br />
    93. 93. Background<br />Different servers in a datacenter is a common scenario<br />Differences by memory size, CPU generation or # or type of pNICs<br />Best practice up to now<br />Separate different hosts in different clusters<br />Workarounds<br />Creating affinity/ anti-affinity rules<br />Pinning a VM to a single host by disabling DRS on the VM.<br />Disadvantage<br />Too expensive as each cluster needed to have HA failover capacity<br />New feature: DRS Groups<br />Host and VM groups <br />Organize ESX hosts and VMs into groups<br />Similar memory<br />Similar usage profile<br />…<br />
    94. 94. VM-host Affinity (DRS)<br />Required rules<br />Preferential rules<br />Rule enforcement: 2 options<br /><ul><li>Required: DRS/HA will never violate the rule; event generated if violated manually. Only advised for enforcing host-based licensing of ISV apps.
    95. 95. Preferential: DRS/HA will violate the rule if necessary for failover or for maintaining availability</li></li></ul><li>Hard Rules<br />Hard Rules<br />DRS will follow the hard rules<br />With DPM hosts will get powered on to follow a rule<br />If DRS can’t follow, vCenter will display an alarm<br />Can not be overwritten by user<br />DRS will not generate any recommendations which would violate hard rules<br />DRS Groups and hard rules with HA<br />Hosts will be tagged as “incompatible” in case of “Must Not run…” so HA will take care of these rules, too<br />
    96. 96. Soft Rules<br />Soft Rules<br />DRS will follow a soft rule if possible<br />Will allow actions <br />User-initiated<br />DRS-mandatory<br />HA actions<br />Rules are applied as long as their application does not impact satisfying current VM cpu or memory demand<br />DRS will report a warning if the rule isn’t followed<br />DRS does not produce a move recommendation to follow the rule<br />Soft VM/host affinity rules are treated by DRS as "reasonable effort"<br />
    97. 97. Grouping Hosts with different capabilities<br />DRS Groups Manager<br />Defines Groups<br />VM groups<br />Host groups<br />
    98. 98. Managing ISV Licensing<br />Example<br />Customer has 4-node cluster<br />Oracle DB and Oracle BEA are charged for every host that can run them.<br />vSphere 4.1 introduces “hard partitioning”<br />Both DRS and HA will honour this boundary.<br />Rest of VMs<br />Oracle DB<br />DMZ VM<br />Oracle BEA<br />DMZ LAN<br />Production LAN<br />
    99. 99. Managing ISV Licensing<br />Hard partitioning<br />Hosts that are in a VM-host “must” affinity rule are considered compatible hosts; all the others are tagged as incompatible hosts. DRS, DPM and HA are unable to place the VMs on incompatible hosts. Due to the incompatible host designation, the mandatory VM-Host rule is a feature that can be (undeniably) described as hard partitioning. You cannot place and run a VM on an incompatible host<br />Oracle has not acknowledged this as hard partitioning.<br />Sources<br />http://frankdenneman.nl/2010/07/vm-to-hosts-affinity-rule/<br />http://www.latogalabs.com/2010/07/vsphere-41-hidden-gem-host-affinity-rules/<br />
    100. 100. Example of setting-up: Step 1<br />In this example, we are adding the “WinXPsp3” VM to the group.<br />The group name is “Desktop VMs”<br />
    101. 101. Example of setting-up: Step 2<br />Just like we can group VM, we can also group ESX<br />
    102. 102. Example of setting-up: Step 3<br />We have grouped the VMs in the cluster into 2<br />We have grouped the ESX in the cluster into 2<br />
    103. 103. Example of setting-up: Step 4<br />This is the screen where we do the mapping.<br />VM Group mapped to Host Group<br />
    104. 104. Example of setting-up: Step 5<br />Mapping is done.<br />The Cluster Settings dialog box now displays the new rule type.<br />
    105. 105. HA/ DRS<br />DRS lists rules<br />Switch on or off<br />Expand to display DRS Groups <br />Rule details<br />Rule policy<br />Involved Groups<br />
    107. 107. Enhancement for Anti-affinity rules<br />Now more than 2 VMs in a rule<br />Each rule can have several VMs<br />Keep them all together<br />Separate them across the cluster<br />For each VM at least 1 host is needed<br />
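The "at least 1 host per VM" requirement of a separation rule can be illustrated with a toy placement routine. This is not how DRS places VMs; it is a minimal sketch with hypothetical VM and host names that only shows why a rule with N VMs needs N hosts.

```python
def place_separately(vms, hosts):
    """Assign each VM in a separation rule to its own host.

    Returns a VM-to-host mapping, or None when there are fewer hosts
    than VMs and the rule therefore cannot be satisfied.
    """
    if len(hosts) < len(vms):
        return None
    # Pair VMs with hosts one-to-one so no two VMs share a host.
    return dict(zip(vms, hosts))


# Hypothetical names for illustration only.
rule_vms = ["vm1", "vm2", "vm3", "vm4", "vm5"]
four_hosts = ["esx1", "esx2", "esx3", "esx4"]
five_hosts = four_hosts + ["esx5"]

print(place_separately(rule_vms, four_hosts))  # rule cannot be satisfied
print(place_separately(rule_vms, five_hosts))
```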
    108. 108. DPM Enhancements<br />Scheduling DPM<br />Turning on/off DPM is now a scheduled task<br />DPM can be turned off prior to business hours in anticipation for higher resource demands<br />Disabling DPM<br />It brings hosts out of standby<br />Eliminates risk of ESX hosts being stuck in standby mode while DPM is disabled. <br />Ensures that when DPM is disabled, all hosts are powered on and ready to accommodate load increases. <br />
    109. 109. DPM Enhancements<br />
    110. 110. vMotion<br />
    111. 111. vMotion Enhancements<br />Significantly decreased the overall migration time (time will vary depending on workload)<br />Increased number of concurrent vMotions:<br />ESX host: 4 on a 1 Gbps network and 8 on a 10 Gbps network<br />Datastore: 128 (both VMFS and NFS)<br />Maintenance mode evacuation time is greatly decreased due to above improvements<br />
    112. 112. vMotion<br />Re-write of the previous vMotion code<br />Sends memory pages bundled together instead of one after the other<br />Less network/ TCP/IP overhead<br />Destination pre-allocates memory pages<br />Multiple senders/ receivers<br />Not only a single world responsible for each vMotion thus limit based on host CPU<br />Sends list of changed pages instead of bitmaps<br />Performance improvement<br />Throughput improved significantly for single vMotion<br />ESX 3.5 – ~1.0Gbps<br />ESX 4.0 – ~2.6Gbps<br />ESX 4.1 – max 8 Gbps<br />Elapsed time reduced by 50%+ on 10GigE tests. <br />Mix of different bandwidth pNICs not supported<br />
    113. 113. vMotion<br />Aggressive Resume<br />Destination VM resumes earlier<br />At that point only the workload's memory pages have been received<br />Remaining pages are transferred in the background<br />Disk-Backed Operation<br />Source host creates a circular buffer file on shared storage<br />Destination opens this file and reads out of it<br />Works only on VMFS storage<br />In case of network failure during transfer, vMotion falls back to disk-based transfer<br />Works together with the Aggressive Resume feature above<br />
    114. 114. Enhanced vMotion Compatibility Improvements<br />Preparation for AMD Next Generation without 3DNow!<br />Future AMD CPUs may not support 3DNow!<br />To prevent vMotion incompatibilities, a new EVC mode is introduced.<br />
    115. 115. EVC Improvements<br />Better handling of powered-on VMs<br />vCenter Server now uses a live VM's CPU feature set to determine if it can be migrated into an EVC cluster<br />Previously, it relied on the host's CPU features<br />A VM can run with a different vCPU feature set than the host it runs on<br />E.g. if it was initially started on an older ESX host and vMotioned to the current one<br />So the VM is compatible with an older CPU and could possibly be migrated to the EVC cluster even if the ESX host the VM runs on is not compatible<br />
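
The difference between the old host-based check and the new VM-based check can be sketched with plain sets. The feature labels and the function name `fits_evc_baseline` are illustrative assumptions; vCenter's real compatibility check is more involved than a subset test.

```python
def fits_evc_baseline(cpu_features, cluster_baseline):
    """Return True if every CPU feature in use is offered by the EVC baseline."""
    return set(cpu_features) <= set(cluster_baseline)


# Illustrative feature labels only, not an actual EVC mode definition.
cluster_baseline = {"sse2", "sse3", "ssse3"}
host_features = {"sse2", "sse3", "ssse3", "sse4.2"}  # newer host
vm_features = {"sse2", "sse3"}  # VM was first powered on on an older host

host_based_result = fits_evc_baseline(host_features, cluster_baseline)
vm_based_result = fits_evc_baseline(vm_features, cluster_baseline)
```

Here the host-based check fails because the host exposes a feature outside the baseline, while the VM-based check passes because the running VM never saw that feature.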
    116. 116. Enhanced vMotion Compatibility Improvements<br />Usability Improvements<br />VM's EVC capability: The VMs tab for hosts and clusters now displays the EVC mode corresponding to the features used by VMs.<br />VM Summary: The Summary tab for a VM lists the EVC mode corresponding to the features used by the VM.<br />
    117. 117. EVC (3/3)<br />Earlier Add-Host Error detection<br />Host-specific incompatibilities are now displayed before the Add-Host workflow when adding a host into an EVC cluster<br />Previously, this error occurred only after the administrator had completed all the required steps<br />Now the warning appears earlier<br />
    118. 118. Licencing<br />Host-Affinity, Multi-core VM, Licence Reporting Manager<br />
    119. 119. Multi-core CPU inside a VM<br />Click this<br />
    120. 120. Multi-core CPU inside a VM<br />2-core, 4-core, 8 core.<br />No 3-core, 5 core, 6 core, etc<br />Type this manually<br />
    121. 121. Multi-core CPU inside a VM<br />How to enable (per VM, not batch)<br />Turn off the VM. This cannot be done online.<br />Click Configuration Parameters<br />Click Add Row and type cpuid.coresPerSocket in the Name column.<br />Type a value (2, 4, or 8) in the Value column.<br />The number of virtual CPUs must be divisible by the number of cores per socket. The coresPerSocket setting must be a power of two.<br />Notes:<br />If enabled, CPU Hot Add is disabled<br />
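
The two constraints on cpuid.coresPerSocket can be checked before editing a VM, as in the sketch below. The helper name `sockets_for` is hypothetical; only the allowed values (2, 4, 8) and the divisibility rule come from the slide.

```python
ALLOWED_CORES_PER_SOCKET = (2, 4, 8)  # values listed on the slide


def sockets_for(num_vcpus, cores_per_socket):
    """Validate a cpuid.coresPerSocket value and return the socket count.

    Raises ValueError if the value is not 2, 4, or 8, or if the vCPU count
    is not divisible by it.
    """
    if cores_per_socket not in ALLOWED_CORES_PER_SOCKET:
        raise ValueError("cores per socket must be 2, 4, or 8")
    if num_vcpus % cores_per_socket != 0:
        raise ValueError("vCPU count must be divisible by cores per socket")
    return num_vcpus // cores_per_socket
```

For instance, an 8-vCPU VM with a value of 2 presents 4 sockets to the guest, while a 6-vCPU VM cannot use a value of 4.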
    122. 122. Multi-core CPU inside a VM<br />Once enabled, the setting is not readily visible to the administrator in the UI. <br />VM listing in vSphere Client does not show cores<br />Possible to write scripts<br />Iterate per VM<br />Sample tools<br />CPU-Z<br />MS SysInternals<br />
    123. 123. Customers Can Self-Enforce Per VM License Compliance<br />When customers use more than they bought<br />vCenter raises an alert<br />But they can continue managing the additional VMs, so overuse is possible.<br />Customers are responsible for purchasing additional licenses and any back-SNS, so Support & Subscription must be backdated. This is consistent with current vSphere pricing.<br />
    124. 124. Thank You<br />I’m sure you are tired too <br />
    126. 126. vSphere Guest API<br />It provides functions that management agents and other software can use to collect data about the state and performance of a VM. <br />The API provides fast access to resource management information, without the need for authentication.<br />The Guest API provides read‐only access. <br />You can read data using the API, but you cannot send control commands. To issue control commands, use the vSphere Web Services SDK.<br />Some information that you can retrieve through the API:<br />Amount of memory reserved for the VM.<br />Amount of memory being used by the VM.<br />Upper limit of memory available to the VM.<br />Number of memory shares assigned to the VM.<br />Maximum speed to which the VM’s CPU is limited.<br />Reserved rate at which the VM is allowed to execute. An idling VM might consume CPU cycles at a much lower rate.<br />Number of CPU shares assigned to the VM.<br />Elapsed time since the VM was last powered on or reset.<br />CPU time consumed by a particular VM. When combined with other measurements, you can estimate how fast the VM’s CPUs are running compared to the host CPUs<br />
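
The last point, estimating how fast the VM's CPUs run relative to the host, can be sketched from two samples of consumed CPU time and elapsed time. This is a simplified estimate for a single-vCPU VM; the function name, the parameter names, and the host clock speed input are assumptions, and the sample values would in practice be read through the Guest API.

```python
def effective_cpu_mhz(host_mhz, cpu_used_ms_start, cpu_used_ms_end,
                      elapsed_ms_start, elapsed_ms_end):
    """Estimate effective VM CPU speed between two samples.

    Scales the host clock speed by the fraction of elapsed time during
    which the VM actually consumed CPU. Simplified: single vCPU assumed.
    """
    elapsed = elapsed_ms_end - elapsed_ms_start
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    used = cpu_used_ms_end - cpu_used_ms_start
    return host_mhz * used / elapsed
```

For example, a VM that consumed 500 ms of CPU time during a 1000 ms interval on a 2000 MHz host is running at an estimated 1000 MHz.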