Joseph Dantoni presentation on Virtualization from SQL Saturday DC.
Virtualization for DBAs
November 5, 2011
Senior SQL Server DBA at Comcast
Costs and Benefits
Optimizing SQL for a Virtual Environment
Major Virtualization Players
It seemed like a good idea at the time…
Server Room Sprawl
Power and Cooling Issues in DCs
Broader availability of SAN storage
Guest—The virtual server running
underneath the physical host and hypervisor
(instance of an Operating System)
Host—The physical server that your virtual
machines run on
Hypervisor—The underlying software that
performs the load balancing and sharing of
resources between guest operating systems
Thin Provisioning—Allowance in virtual
environments to overallocate physical
resources (more to come later)
Deduplication—Process of compressing
memory/disk space by saving only one copy
of common bits
moves guest OS’s from host with high
resource utilization to lower. Also an HA
function with the hypervisor
Snapshot—A full point in time backup of your
guest OS (very handy for
Cloning—The process of building a gold
guest image to allow rapid deployment
VMWare isn’t cheap
Licensing about $25k per server for an Enterprise
Plus license on a decent-sized server
Licensing changed from CPU—to CPU with
memory grants, allowed 96 GB per CPU license
Included with your Windows Server Licenses
(amount of VMs vary based on edition)
SCOM, while not required, is recommended
Benefits of Virtualization
Lower cooling and power
Higher utilization of hardware
Can be used for HA configurations
Rapid Deployment of new environments
Use Gold Standard servers and rollout
How this works…
One Physical Server
Guest Guest Guest
What does the hypervisor do?
Manages resources between guest O/S
Failover and DR
HA and DR
Virtualization hosts are the typical servers
you might run SQL Server on.
2 x 4-6 core processors (Dual socket servers
represent 80% of install base)
A Lot of RAM
Allows over allocation
allows for easy
management of this
along with SAN
NOT GOOD FOR
Shared Environment vs Dedicated
This can make monitoring and baselining
your server more challenging
You will want to have open communications
with your VM administrators
Ask for view access into VCenter—it will
show you what else is going on in the
Can be over allocated
Use servers with the newest chips—they are
optimized for Virtual Workloads
Maintain 1:1 ratio of physical cores to vCPU
for production boxes
For production workloads you may want to
dedicate CPUs to the machine
Memory can be
over allocated (but
don’t do it for
it by de-duplicating
Host Page Files
Two choices of file types—VMFS (VMWare
File System) and RDM (Raw Device
Performance between two is similar
RDM is required for clustering
VMFS generally more flexible
Use Shared Storage (SAN) to get HA and DR
Partition alignment still matters < Windows
Work with storage team to monitor I/O—
Hypervisors can have strange I/O patterns
Virtualizing SQL Server
Use Trace Flag –T834—large pages enabled
Set min and max memory—this will lock
SQL’s memory to prevent possible balloon
Follow the same storage best practices you
would for a physical box (Separate TempDB,
Test out I/O performance before beginning
Monitoring SQL Server
From the server perspective everything stays
Everything may not match at times
Ask for access to the vSphere client!
It’s the only way to have an overview into the
Troubleshoot as you normally would, then
Similarly with a SAN, try to identify what
apps are sharing your resources
Can adjust load on the fly by using vMotion
(or Live Migration)
Virtualization is the future, and the future is
Virtual servers work from a shared resource
pool and that can impact your workloads
Identify changes you need to make to your
SQL Servers for Virtual Environments
Get access to your virtualization
Blog (slides): joedantoni.wordpress.com
This is what we’re going to discuss today—the major players in Virtualization (warning—this talk will be VMWare centric). We’ll go through all of the terminology involved in virtualization, then discuss some of the costs and benefits. Next we’ll take a dive into the technology underlying virtualization. Lastly we will talk about what you need to consider when running SQL Server on a VM. My goal for this session is for you to have a good understanding of Virtualization and how it works.
Typically the push for Virtualization comes from your data center and Windows folks, who hear it from their management, who want to get higher server and CPU utilization out of their hardware investment. These are the major players in the space. VMWare is by far the market leader, and I’ll do most of my talking about them today. Hyper-V is catching up, but is still pretty far behind on the feature set. Most of the terminology in this presentation will be VMWare centric, since it’s the market leader and it’s what I have the most experience with. If you just want to play with Virtualization, VirtualBox is a really good solution; it’s free and easy to play with.
So in the 2000s, as we were starting to try to consolidate SQL Servers, something else happened: our application server environments started to sprawl tremendously. Each new project needed a new app server for each of its environments, and we ended up with server rooms that look like bad subdivisions.
So what did this lead to? More servers needed more power, and servers got more power hungry in recent years. Most servers now come with two 750 W power supplies, and that’s a lot of juice. They also crank out a lot of heat. This might not be an issue in a small environment, but as you start to reach capacity in any data center, it’s hard to add more A/C and more power capacity after the fact. The other thing that led us to virtualization was that servers have gotten WAY more powerful: we can have 16 cores in a standard two-socket server that we get for under $10,000. Additionally, having a SAN has become way more common in the last decade. You don’t NEED a SAN for virtualization, but many of the advanced features take advantage of shared storage.
The next few slides are about the terminology we use in virtualization. I’ll slow down for a minute to let you take notes.
So what does this cost you? There has been some controversy in this space in the last few months, as VMWare changed its pricing model. It used to be purely CPU based, and the change was to take it in the direction of memory-plus-CPU based licensing. It’s a more confusing model, but it doesn’t seem to have hurt VMWare sales much. Hyper-V is included in the cost of your Windows licensing. So it’s basically free, but its feature set isn’t as robust as VMWare’s, and it has less multi-platform support (it doesn’t support RedHat).
So here is what you gain by going with a virtualization solution. You get a higher density of servers, so your power and cooling costs are reduced. As I’ll show in a few slides, this can be used for HA solutions and even DR, since you have some level of hardware protection. You can have a new server built in less than 10 minutes: VMWare allows you to create a template image and rename it to create a server. For example, you could have a Windows Server with SQL installed with all of your best practices implemented, and it will spin up while you are getting coffee. Additionally, it’s a great place to park an app that requires a legacy version of Windows or SQL Server. O/S snapshots are also a wonderful thing; more on that later.
This is a pretty simplified example of a server running 3 VMs. Host will always refer to the physical box you are running on. The hypervisor is the software that allows guest creation and manages host resources amongst the guest servers. In this case we are running 3 guest O/Ss. They could be running any version of Windows or Linux, mix and match; it doesn’t matter.
So the hypervisor does a number of things. The guest O/Ss are operating out of a shared resource pool, so it allocates CPU and memory resources between them. In the case of VMWare, it may move a guest from a high utilization host to a low utilization host. This process is basically seamless (there are a few seconds of degraded performance). It can also manage failover and DR; I’ll go into details on that in a few slides. Backups can also be managed through the VMWare layer, which is the most efficient way to back up VMs.
This is a better picture of what a VMWare architecture looks like, though the density here could be pushed more. vMotion is the process of moving VMs between servers. SRM (Site Recovery Manager) is VMWare’s multi-site DR solution, which requires SAN replication, and if any of you were in my last session, you’ll know that it’s expensive and complicated to set up, but pretty cool.
Snapshots are a wonderful technology that’s built into the virtualization solution. In a nutshell, it’s a way of rapidly doing a full backup of your systems. The hypervisor takes a block level snapshot of the guest O/S and tracks the changes in a snapshot file. This is great for patching, CUs, code deployments, SQL upgrades, and whatever. The one caveat is that you need to delete your snapshots as soon as you are done needing them. Because they capture the delta of your server, they can grow very big very quickly.
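The "delete your snapshots" caveat can be turned into a simple housekeeping check. The sketch below is only an illustration: the `Snapshot` record, the sample inventory, and the three-day threshold are all hypothetical, and in practice the data would come from whatever export your virtualization admins can give you.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Snapshot:
    # Hypothetical record; not a real vSphere or Hyper-V object.
    vm_name: str
    created: datetime
    size_gb: float


def stale_snapshots(snapshots, now, max_age=timedelta(days=3)):
    """Return snapshots older than max_age, largest first."""
    old = [s for s in snapshots if now - s.created > max_age]
    return sorted(old, key=lambda s: s.size_gb, reverse=True)


if __name__ == "__main__":
    now = datetime(2011, 11, 5, 12, 0)
    inventory = [
        Snapshot("sql-dev-01", datetime(2011, 11, 4, 9, 0), 2.5),
        Snapshot("sql-prod-02", datetime(2011, 10, 20, 9, 0), 48.0),
        Snapshot("app-test-03", datetime(2011, 10, 30, 9, 0), 12.0),
    ]
    for snap in stale_snapshots(inventory, now):
        print(f"{snap.vm_name}: {snap.size_gb} GB, created {snap.created:%Y-%m-%d}")
```

Sorting by size puts the snapshots that are eating the most storage at the top of the cleanup list.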
This is a picture of how VMWare’s DR scenario works, and it’s relatively straightforward. If you lose a server, VMWare vMotions the VMs to another host. This is seamless to the VM and your app. Also, just in case you were wondering, you can set up clustering within VMWare.
So the hardware on these VMWare boxes is pretty similar to what you run your SQL Servers on: standard 2-socket x86 servers with a lot of RAM. CPU isn’t the biggest deal, though it is important to be on the newer processor lines, which have much better support for virtualization.
Thin provisioning mostly applies in the storage world (your SAN folks can do it as well). In a nutshell, if you ask for a 100 GB drive, thin provisioning may only give out 10 GB until you need more. And much like expanding a data file in SQL Server is an expensive task, expanding a VMWare disk is an expensive proposition. I think this is fine for non-production environments, but if you are doing production work, make sure you are fully provisioned up front.
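To make the arithmetic concrete, here is a small sketch of how overallocated a thin-provisioned datastore is. The function names and the sample sizes are made up for illustration; the point is only that the capacity promised to guests can exceed what physically exists.

```python
def overcommit_ratio(provisioned_gb, physical_gb):
    """Capacity promised to guests divided by capacity that actually exists."""
    if physical_gb <= 0:
        raise ValueError("physical_gb must be positive")
    return sum(provisioned_gb) / physical_gb


def headroom_gb(used_gb, physical_gb):
    """Physical space left before the datastore fills up."""
    return physical_gb - sum(used_gb)


if __name__ == "__main__":
    # Five guests each asked for a 100 GB drive but only use 10 GB so far,
    # all sitting on a hypothetical 200 GB datastore.
    provisioned = [100] * 5
    used = [10] * 5
    print("overcommit ratio:", overcommit_ratio(provisioned, 200))
    print("headroom (GB):", headroom_gb(used, 200))
```

A ratio above 1.0 means the guests cannot all grow to their full size at once, which is the risk to avoid on production database servers.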
In a virtual environment, by definition, you are in a multi-tenant environment. It’s like living in that big building in that last slide, so you don’t want to have noisy neighbors. The big consideration here is that this can make your baselining and monitoring processes more challenging than on a straight physical server. This is a little bit political, especially in big organizations. Ask your VMWare admins what servers you are sharing your environment (and storage) with. As your environment grows, you may want to have a dedicated VMWare cluster for your database servers. Also, and I will bring this up again, you will want to have visibility into vCenter. This way you can see what’s going on in your entire VMWare infrastructure.
CPUs may be overallocated. What this means is that you can grant more CPUs to VMs than are physically available on the box, which can be a bad idea under high loads. For development it’s less of an issue. Another interesting thing to note is that VMWare gets slightly less efficient as you add more CPUs to servers; their ideal number is 4. Basically, each processor you add becomes progressively less powerful when compared to a physical processor. Like I mentioned earlier, you want to use processors that are optimized for virtual workloads. Any new server will be, but keep this in mind if you are deploying to older hardware. And lastly, on production VMs, maintain a 1:1 ratio of vCPUs to physical cores. You can also reserve the CPUs in VMWare so that they aren’t allocated to other VMs.
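The 1:1 guideline is easy to check once you know how many vCPUs each guest on a host has been granted. This is a minimal sketch with made-up guest sizes; the function names are illustrative only.

```python
def vcpu_ratio(vcpus_per_vm, physical_cores):
    """Total vCPUs granted to guests divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def is_overallocated(vcpus_per_vm, physical_cores):
    """True when guests have been granted more vCPUs than the host has cores."""
    return vcpu_ratio(vcpus_per_vm, physical_cores) > 1.0


if __name__ == "__main__":
    cores = 16  # a two-socket host with 8 cores per socket
    print(vcpu_ratio([4, 4, 4, 4], cores), is_overallocated([4, 4, 4, 4], cores))
    print(vcpu_ratio([4, 4, 4, 4, 4], cores), is_overallocated([4, 4, 4, 4, 4], cores))
```

Four 4-vCPU guests exactly fill the 16-core host; adding a fifth pushes the ratio to 1.25, which may be acceptable for development but breaks the production guideline.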
So through the magic of VMWare we can over allocate memory to a server. Much like CPU, what that means is we can allocate more memory than is physically available. The hypervisor handles it behind the scenes by deduplicating blocks in memory. NEVER DO THIS IN PRODUCTION! If your environment comes under memory pressure, the host server will begin paging and performance will degrade very rapidly.
Ah, balloons we all loved them as children, but as we know they can be very dangerous—this gentleman’s house was stolen by a bunch of balloons.
What the balloon driver does is reclaim memory from guests that VMWare thinks aren’t using it. If your SQL Server wants to use this memory, that is a very, very bad thing. To prevent this from happening, use the lock pages in memory option in SQL, and for production VMs have your VMWare admin set a memory reservation for your server. Additionally, I recommend setting the min and max memory in SQL to the same value, though that’s not a guarantee.
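As a rough sketch of the min/max memory recommendation, the helper below generates the T-SQL to set both options to the same value using `sp_configure`. The 4 GB left for the operating system is an assumption for illustration, not a figure from this talk, so size it for your own server. Lock pages in memory is a Windows privilege granted to the SQL Server service account and is not covered by this script.

```python
def memory_config_sql(vm_memory_mb, os_reserve_mb=4096):
    """Build T-SQL that sets min and max server memory to the same value.

    os_reserve_mb is a hypothetical allowance for the guest OS.
    """
    if os_reserve_mb >= vm_memory_mb:
        raise ValueError("os_reserve_mb must be smaller than vm_memory_mb")
    target_mb = vm_memory_mb - os_reserve_mb
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'min server memory (MB)', {target_mb};",
        f"EXEC sp_configure 'max server memory (MB)', {target_mb};",
        "RECONFIGURE;",
    ])


if __name__ == "__main__":
    # A guest with 32 GB of RAM assigned.
    print(memory_config_sql(32768))
```

Review the generated statements before running them, and pair them with a matching memory reservation on the VMWare side.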
One of the caveats with raw device mapping is that you are limited to 256 RDMs in a given VM cluster.
Partition alignment still matters if you are running an OS that is below Windows 2008. VMWare tends to do a lot of random I/O when you think it should be sequential, so pay attention to I/O related perfmon counters.
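The alignment check itself is simple arithmetic: a partition is aligned when its starting offset is an exact multiple of the storage stripe unit. The sketch below assumes a 64 KB stripe unit, which is only an example; confirm the real value with your storage team.

```python
def is_aligned(partition_offset_bytes, stripe_unit_bytes=65536):
    """True when the partition starts on a stripe-unit boundary."""
    return partition_offset_bytes % stripe_unit_bytes == 0


if __name__ == "__main__":
    # Pre-2008 Windows default: 63 sectors x 512 bytes = 32,256 bytes.
    print(is_aligned(63 * 512))
    # Windows Server 2008 default: a 1 MB starting offset.
    print(is_aligned(1024 * 1024))
```

The older default offset of 32,256 bytes does not divide evenly by 64 KB, which is why partitions created on older Windows versions need to be aligned by hand.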