How to Resolve Capacity Bottlenecks and Ensure Great Performance in Your VMware ESX Environment


    1. How to Resolve Capacity Bottlenecks and Ensure Great Performance in Your VMware ESX Environment
    A VMware System Administrator’s Guide to Identifying Capacity Bottlenecks, Predicting Future Capacity Bottlenecks, and Removing Them
    A Whitepaper by:
    Alex Bakman, Founder and CEO, VKernel Corporation, abakman@vkernel.com
    Joan Mealey, Senior Systems Engineer, VKernel Corporation, jmealey@vkernel.com
    2. Table of Contents
    The Virtualized Data Center: A Different World (page 3)
    A Quick Primer on VMware ESX Capacity Management Concepts (page 3)
    Environment Changes that Impact Capacity (page 4)
    Finding Current Capacity Bottlenecks (page 6)
    Identifying Future Capacity Bottlenecks (page 7)
    Finding Pockets of Available Capacity in Your Environment (page 7)
    3. The Virtualized Data Center: A Different World

    As we are writing this paper, the overwhelming majority of organizations are migrating and consolidating their servers from physical to virtual environments. While the savings of the virtualized data center are extremely compelling, there is now a new set of challenges that system administrators did not have to deal with in the physical world.

    One of the challenges is getting used to the fact that all hardware resources (CPU, memory, storage and network) are shared between virtual machines. This means that applications and users can impact each other, and therefore resource monitoring becomes really important. Put another way, if you don’t closely monitor resource consumption by each virtual machine and simply keep adding more virtual machines without analyzing how this will impact all four core resources, the result will be bad performance and even system downtime. Ultimately, this means unhappy users and dissatisfied managers. So, you need to find a way to prevent this problem.

    "As data centers struggle with server consolidation and server virtualization, capacity planning becomes the key to maintaining or improving service quality while containing costs." (The Capacity Planning Software Market, Forrester)

    In this white paper we will give you a formula for ensuring that you won’t run into these problems. In fact, if you closely monitor resource availability, you will create a very stable environment and prove to your users that virtualization is ready for prime time.

    A Quick Primer on VMware ESX Capacity Management Concepts

    Clusters consist of two or more VMware ESX hosts working together as one to provide high availability and redundancy. Clusters allow the sum of all resources from the hosts in the cluster to be spread among the virtual machines on those hosts. This is a typical way to deploy VMware ESX hosts in all but the smallest environments.

    Resource pools allow the administrator to allocate and divide resources among virtual machines in clusters and other resource pools. They work by using reservations (guaranteed resources), shares (for when resources are overcommitted), and limits. Resource pools can be nested and organized in a hierarchical fashion to match the company’s organization. Reservations are used for the more critical virtual machines, enabling you to guarantee that a virtual machine gets a specific amount of memory or CPU time. Shares are used to divide up the remaining resources among the rest of the virtual machines: the more shares a virtual machine has, the higher the percentage of resources it can use. Resource pools can have a “fixed” amount of resources or be linked so that they share resources with other resource pools.
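To make the interplay of reservations and shares concrete, here is a minimal sketch of the idea described above: reservations are satisfied first, and whatever is left is split in proportion to shares. This is a simplified model, not the ESX scheduler itself; it ignores limits and assumes every virtual machine wants as much as it can get. The function name `divide_by_shares` and the sample figures are hypothetical.

```python
def divide_by_shares(capacity, vms):
    """Split a pool's capacity among VMs (simplified illustrative model).

    capacity: total resource in the pool (for example CPU in MHz).
    vms: dict mapping VM name -> (reservation, shares).
    Returns a dict mapping VM name -> resource it could receive under
    full contention: its reservation plus a share-weighted slice of the rest.
    """
    reserved = sum(reservation for reservation, _ in vms.values())
    if reserved > capacity:
        raise ValueError("reservations exceed pool capacity")

    remaining = capacity - reserved
    total_shares = sum(shares for _, shares in vms.values())

    result = {}
    for name, (reservation, shares) in vms.items():
        # The unreserved remainder is divided in proportion to shares.
        extra = remaining * shares / total_shares if total_shares else 0.0
        result[name] = reservation + extra
    return result


# Hypothetical pool: 10,000 MHz, one critical VM with a reservation.
pool = {"db": (2000, 2000), "web": (0, 1000), "test": (0, 1000)}
allocation = divide_by_shares(10000, pool)
```

In this made-up pool, the reserved 2,000 MHz is set aside first and the remaining 8,000 MHz is split 2:1:1, so the critical virtual machine ends up with more than half of the pool.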
    4. As you can see, a physical host is no longer the “resource boundary”. Now the resource boundaries can also extend to clusters and resource pools.

    Distributed Resource Scheduling (DRS) is designed to balance the load across the cluster by migrating virtual machines among the hosts in the cluster. DRS can be set to manual, partially automated, or fully automated. The mode you set determines how involved you will be in deciding those migrations.

    VMware High Availability (HA) allows companies to provide high availability to any application running in a virtual machine. It continuously monitors virtual systems and automatically restarts them in the event of a host failure. VMware HA does not provide zero downtime, and it is dependent on having enough available resources among the hosts in the cluster. For instance, if you have three hosts, the two surviving hosts must have enough spare resources to run the virtual machines from whichever host fails.

    So, as a system administrator, you can just set up your clusters, turn on VMware HA, set DRS to automatic and walk away, right? It’s just not that simple. As you will quickly see, this is a complex organization of resources that requires constant monitoring to continue functioning properly.

    Environment Changes that Impact Capacity

    As changes happen, you need to be aware of the impact they have on your virtual environment. Here is a list of "events" that can cause you to run out of capacity in your VMware ESX data center, resulting in performance problems or even downtime:

    1. Adding new virtual machines through uncontrolled virtual machine sprawl
    2. Removing hosts from clusters, possibly for maintenance
    3. Enabling VMware HA in your cluster without accounting for failover
    4. Changing failover capacity settings in a cluster
    5. Increasing reservations in virtual machines
    6. Changing resource pool configurations
    7. Powering up many virtual machines that were powered off or in maintenance
    8. Natural growth rates in storage, CPU, memory and network utilization
    9. Changes in workloads that result in disk I/O bottlenecks

    Let’s examine these in more detail.
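As a first illustration, the failover requirement behind items 2 to 4 can be checked with simple arithmetic. The sketch below assumes you already have, for one resource such as memory, the capacity and current usage of each host; it then tests every combination of failed hosts and asks whether the survivors have enough spare capacity in aggregate. It is a rough check under stated assumptions, not how VMware HA computes admission control: it ignores reservations, per-VM placement and fragmentation. The function name `can_tolerate_failures` and the host figures are hypothetical.

```python
from itertools import combinations


def can_tolerate_failures(hosts, failures=1):
    """Return True if the cluster can absorb any `failures` host failures.

    hosts: dict mapping host name -> (capacity, used) for one resource.
    The check is aggregate only: load displaced from the failed hosts must
    fit in the combined spare capacity of the surviving hosts.
    """
    names = list(hosts)
    for failed in combinations(names, failures):
        displaced = sum(hosts[name][1] for name in failed)
        spare = sum(
            hosts[name][0] - hosts[name][1]
            for name in names
            if name not in failed
        )
        if displaced > spare:
            return False
    return True


# Hypothetical three-host cluster, memory in GB: (capacity, used).
cluster = {"esx1": (64, 40), "esx2": (64, 30), "esx3": (64, 20)}
one_failure_ok = can_tolerate_failures(cluster, failures=1)
two_failures_ok = can_tolerate_failures(cluster, failures=2)
```

With these sample numbers the cluster can lose any single host, but not two at once. Re-running a check like this after any of the events listed above is one way to notice that a change has eaten into failover headroom.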
    5. In a VMware ESX environment, all memory, CPU, storage and network resources are shared. You are now dealing with a four-dimensional capacity problem. How will you know which of these resources you will run out of first?

    The newly virtualized data center is experiencing constant change. We have worked with many companies that are adding hundreds of virtual machines every week. Virtual machine sprawl is quickly becoming a real issue for many organizations. Even if you are not adding hundreds of virtual machines per week, every virtual machine that you do add can tip the balance in the wrong direction and cause performance problems.

    A good way to visualize the problem is to think of virtual machines as cars and your resources as roads and highways. As you add more cars to the roadways while the number of roads and lanes remains fixed, sooner or later you will cause a serious traffic jam. If the roads do not get expanded with additional lanes, new roads are not built, or the number of cars does not decrease, the result is very predictable.

    Virtual machine sprawl is already causing numerous traffic jams or bottlenecks in many data centers. Therefore, it is essential to quickly identify current traffic jams and resolve them as soon as possible. Once you can accomplish this, you can do the same thing with future bottlenecks and thus stay ahead of the problem and prevent a situation in which you run out of capacity.

    You also need to take into consideration that your capacity is being depleted not only by new virtual machines, but also by natural growth in your applications. Remember the age-old rule of computing, which states that your programs will grow to fill all available memory, CPU and storage. Nothing has changed in that regard in the virtualized data center. You must figure out the growth rate of each resource type, that is, by what percentage your memory, CPU, storage and network utilization is increasing each week. By getting a baseline and understanding trends, you will be able to proactively monitor your data center for anomalies and identify problem areas quickly.

    Another factor you must take into consideration is the number of powered-down virtual machines and virtual machines currently in maintenance mode. In their current state they are not consuming resources, but if powered up they will. Unless you have total control over user behavior, and most of us don’t, you will never know when these virtual machines will get powered up. If you have a large number of virtual machines powered off and they suddenly become active, you may not have enough available capacity.
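The weekly growth rate and baseline mentioned above can be turned into a rough forecast. The sketch below assumes you collect one utilization sample per week for a resource and that growth is roughly linear; real workloads may not be, so treat the result as an early warning rather than a prediction. The function names `weekly_growth_pct` and `weeks_until_full` and the sample data are hypothetical.

```python
def weekly_growth_pct(previous, current):
    """Percentage change in utilization from one weekly sample to the next."""
    return (current - previous) * 100 / previous


def weeks_until_full(weekly_used, capacity):
    """Estimate the weeks remaining before a resource reaches capacity.

    weekly_used: utilization samples, oldest first, one per week.
    capacity: total amount of the resource, in the same unit.
    Assumes linear growth at the average weekly increase seen so far.
    Returns None when utilization is flat or shrinking.
    """
    if len(weekly_used) < 2:
        raise ValueError("need at least two weekly samples")
    avg_growth = (weekly_used[-1] - weekly_used[0]) / (len(weekly_used) - 1)
    if avg_growth <= 0:
        return None
    return (capacity - weekly_used[-1]) / avg_growth


# Hypothetical datastore usage in GB over four weeks, 600 GB total.
samples = [400, 420, 440, 460]
growth = weekly_growth_pct(samples[0], samples[1])
runway = weeks_until_full(samples, capacity=600)
```

Running the same calculation for memory, CPU, storage and network gives four runway figures; the shortest one points to the resource most likely to run out first.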
    6. And we are not done yet. There are more changes that can impact available capacity in your data center. VMware ESX provides many ways of organizing your resources. For example, a resource pool can be configured as fixed or expandable. When fixed, the resource pool is limited to the resources explicitly assigned to it. When expandable, a resource pool can tap into its parent resource pool when it can’t satisfy requests on its own. If this resource pool setting is changed, your available capacity will be impacted.

    Another critical setting that impacts capacity is the failover option in a cluster that has VMware HA enabled. By default it is set to 1, meaning that the cluster should be able to handle one host failure and must have enough capacity on the remaining hosts to handle all of the virtual machines that need to be moved off the failed host. Clearly, if a failover occurs, the available capacity will change drastically. Moreover, if the setting is higher than 1, you really need to make sure that adequate capacity exists.

    As you can see, there are many ways to run out of resources. Now that we know this, the question becomes what we can do to prevent this situation.

    Finding Current Capacity Bottlenecks

    The first thing you must do is identify current capacity bottlenecks. To identify them, you need to clearly understand the utilization of all resources (memory, CPU, storage, network, disk I/O) on all hosts, clusters, and resource pools. Our recommendation is to focus first on the resources that usually become bottlenecks first, such as memory and storage, and then proceed to examine the other resource types.

    Unfortunately, with most tools available today, which graph resource utilization over a period of time, identifying current bottlenecks is a very time-consuming task. Consider this simple example. Let’s say you have 30 VMware ESX hosts organized into 4 clusters with 10 resource pools. To find current capacity bottlenecks, you will have to examine (30 hosts + 4 clusters + 10 resource pools) x 4 resource types = 176 charts. And this is not a one-time event. Given how quickly most organizations are adding new virtual machines, you need to examine the charts at least once or twice a week. That’s a lot of time to spend on one issue.
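The arithmetic in the example above generalizes to any environment size. The following small sketch simply restates the formula; the function name `charts_to_review` is hypothetical.

```python
def charts_to_review(hosts, clusters, resource_pools, resource_types=4):
    """Charts to inspect in one pass: one per resource boundary per resource type."""
    return (hosts + clusters + resource_pools) * resource_types


# The example from the text: 30 hosts, 4 clusters, 10 resource pools.
per_pass = charts_to_review(hosts=30, clusters=4, resource_pools=10)

# Reviewing twice a week doubles the weekly workload.
per_week = per_pass * 2
```

Because the count grows with every host, cluster and resource pool you add, the review effort rises in step with the environment.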
    7. Identifying Future Capacity Bottlenecks

    Identifying future capacity bottlenecks is not a trivial process. Some people are fooled into thinking that as long as you correctly compute future growth requirements, you are all set. However, the problem is much more complex than that. Preventing future capacity bottlenecks requires you to perform many steps, which need to be repeated often enough to avoid running into problems. We recommend that VMware administrators perform the following steps at least several times per week.

    1. Compute the additional resources needed to support new virtual machines before deployment, and figure out which hosts, clusters, or resource pools have the necessary resources.

    2. Closely monitor growth in resource utilization of every host, cluster, and resource pool. Here you must examine resource utilization at every level, just as you do to find current capacity bottlenecks (see the formula above). You then have to compare last week’s capacity status with this week’s in order to identify the “delta”, that is, what has changed in the utilization of resources at every resource boundary level in your infrastructure.

    3. Impose an ironclad change control process on critical configuration settings that control resource allocation. You don’t want to discover after the fact that someone in your organization changed the failover settings, changed reservations, or unilaterally decided to remove a host from a cluster for maintenance. Your change control process must extend to powered-down virtual machines, with tight control over who can power these virtual machines up, where, and when.

    4. Closely monitor workloads, especially in terms of disk I/O and time shifts. As you virtualize more servers, the variability of workloads will inevitably increase. Some workloads will be I/O intensive and others will be CPU intensive. The times at which workloads peak will also shift. It is a given. You must find a way to forecast these situations.

    Finding Pockets of Available Capacity in Your Environment

    For virtualization projects to be ongoing successes, you must have control of your available capacity. As you add new virtual machines, remember these critical questions:

    • How many more virtual machines can you add?
    8. • What resources will you run out of first as more virtual machines are added?
    • Where are your current and future capacity bottlenecks?

    To help VMware administrators deal with these challenges, companies such as VKernel provide products designed to solve these issues quickly. Be sure to visit our website, www.vkernel.com, frequently to download our products and check for updates. If you have questions or need help, you are welcome to email us directly: abakman@vkernel.com or jmealey@vkernel.com.

    For more information or to learn more, call 866-370-2733 or visit www.vkernel.com.
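The first two questions above (how many more virtual machines you can add, and which resource you will run out of first) can be estimated with simple division once you know the free capacity of each resource and the footprint of a typical new virtual machine. The sketch below is a rough estimate under those assumptions; it treats all new virtual machines as identical and ignores failover headroom and growth. The function name `additional_vm_capacity` and the figures are hypothetical.

```python
def additional_vm_capacity(free, per_vm):
    """Estimate how many more identical VMs fit, and what runs out first.

    free: dict mapping resource name -> currently available amount.
    per_vm: dict mapping resource name -> amount one new VM needs.
    Returns (vm_count, limiting_resource).
    """
    # How many VMs each resource could support on its own.
    fits = {name: int(free[name] // per_vm[name]) for name in per_vm}
    # The resource that supports the fewest VMs is the bottleneck.
    limiting = min(fits, key=fits.get)
    return fits[limiting], limiting


# Hypothetical free capacity in a cluster and a typical VM footprint.
free_capacity = {"cpu_mhz": 12000, "memory_gb": 48, "storage_gb": 900, "network_mbps": 600}
typical_vm = {"cpu_mhz": 1000, "memory_gb": 8, "storage_gb": 60, "network_mbps": 20}
count, bottleneck = additional_vm_capacity(free_capacity, typical_vm)
```

In this made-up case memory is exhausted long before CPU, storage or network, which matches the earlier recommendation to look at memory and storage first.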
