Virtualizing Highly Available SQL Servers
Scott Salyer, VMware
VAPP5932
#VAPP5932
VMworld 2013: Virtualizing Highly Available SQL Servers
Published on

VMworld 2013

Scott Salyer, VMware

Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare

  1. Virtualizing Highly Available SQL Servers (Scott Salyer, VMware; session VAPP5932, #VAPP5932)
  2. Agenda
     • Why Virtualize
     • Causes of Downtime and Planning a Strategy
     • Scenario 1 – Baseline High Availability
     • Scenario 2 – AlwaysOn Availability Groups
     • Scenario 3 – SQL Server Failover Clustering
     • Scenario 4 – Rolling Upgrades
     • Disaster Recovery and Backup
     • Summary
  3. Setting Expectations
     • This is NOT a Best Practices session
        • This session covers availability and recovery for SQL Server database VMs
        • This session does NOT cover performance, sizing, scaling, or consolidation. For more information on these topics, please attend VAPP1006-GD, SQL/MS Apps with Jeff Szastak (a group discussion)
  4. Summary
     • Quality of Service (QoS)
        • Guaranteed performance SLAs through resource controls, dynamic load balancing, capacity & performance management
        • Simplified security SLAs with app protection
     • Availability
        • Protection against app failures through high availability and fault tolerance
        • Simplified business continuity with automated disaster recovery & backup
     • Time to Market (TTM)
        • Reduced app provisioning times to minutes through use of templates & intelligent policy management
        • Dynamic scaling of apps through scale-up/scale-out capacity on demand
     Complete Flexibility. Non-Stop Reliability.
  5. Causes of Downtime
     • Planned Downtime
        • Software upgrade (OS patches, SQL Server cumulative updates)
        • Hardware/BIOS upgrade
     • Unplanned Downtime
        • Datacenter failure (natural disasters, fire)
        • Server failure (failed CPU, bad network card)
        • I/O subsystem failure (disk failure, controller failure)
        • Software/data corruption (application bugs, OS binary corruption)
        • User error (shutting down a SQL service, dropping a table)
  6. SQL Server Native Availability Features
     • Failover Clustering
        • Local server redundancy
        • Instance level failover
        • Zero data loss
     • Database Mirroring
        • Local server and storage redundancy
        • Disaster recovery
        • Database level failover
        • Zero data loss with high safety mode
     • Log Shipping
        • Multiple disaster recovery sites for databases
        • Manual failover required
        • App/user error recovery
     • AlwaysOn
        • New in SQL Server 2012
        • AlwaysOn Failover Cluster Instance with shared disk architecture, native support for multi-site cluster
        • AlwaysOn Availability Group with non-shared disk architecture, support for multiple secondaries, readable secondary
  7. Planning a High Availability Strategy
     • Requirements
        • Recovery Time Objective (RTO): what does 99.99% availability really mean?
        • Recovery Point Objective (RPO): zero data loss?
        • HA vs. DR requirements
     • Evaluating a technology
        • What's the cost of implementing the technology?
        • What's the complexity of implementing and managing the technology?
        • What's the downtime potential?
        • What's the data loss exposure?

     Availability              Downtime / year   Downtime / month*   Downtime / week
     "Two Nines"   - 99%       3.65 days         7.2 hours           1.68 hours
     "Three Nines" - 99.9%     8.76 hours        43.2 minutes        10.1 minutes
     "Four Nines"  - 99.99%    52.56 minutes     4.32 minutes        1.01 minutes
     "Five Nines"  - 99.999%   5.26 minutes      25.9 seconds        6.05 seconds
     * Using a 30-day month
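The downtime budgets on this slide follow from simple arithmetic: the unavailable fraction multiplied by the length of the period. The sketch below is one way to reproduce them; the function and constant names are illustrative and not from the session, and the last digit can differ from a published table depending on how values are rounded.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Period lengths in seconds. The 30-day month follows the footnote on the slide.
PERIOD_SECONDS = {
    "year": 365 * SECONDS_PER_DAY,
    "month": 30 * SECONDS_PER_DAY,
    "week": 7 * SECONDS_PER_DAY,
}


def allowed_downtime_seconds(availability_pct: float, period: str) -> float:
    """Return the downtime budget in seconds for one period at a given availability."""
    unavailable_fraction = (100.0 - availability_pct) / 100.0
    return unavailable_fraction * PERIOD_SECONDS[period]


def humanize(seconds: float) -> str:
    """Format a duration using the largest unit that keeps the value at or above 1."""
    for unit, size in (("days", SECONDS_PER_DAY), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.2f} seconds"


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        budget = {p: humanize(allowed_downtime_seconds(pct, p)) for p in PERIOD_SECONDS}
        print(f"{pct}% -> {budget}")
```

Comparing a candidate technology's expected recovery time against these budgets is a quick way to check whether it can plausibly meet a stated RTO.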
  8. High Availability Options
     [Chart: hardware failure tolerance (Unprotected, Automated Restart, Continuous) versus application coverage (0%, 10%, 100%). Plotted options: VMware FT; VMware HA; vMotion (planned downtime); DB Mirroring / RAC / AAG; Microsoft Clustering / Data Guard / AAG]
     • Clustering is too complex and expensive for most applications
     • VMware HA and FT provide simple, cost-effective availability
     • vMotion provides continuous availability against planned downtime
  9. Scenario 1 – Baseline High Availability: Moving beyond physical limitations
  10. VMware Availability Features
  11. VMware vSphere High Availability (HA)
     • Protection against host or operating system failure
        • Automatic restart of virtual machines on any available host in the cluster
        • Provides a simple and reliable first line of defense for all databases
        • Minutes to restart
        • OS and application independent; does not require complex configuration or expensive licenses
  12. VM Mobility
     • Server Maintenance
        • VMware vSphere® vMotion® and VMware vSphere Distributed Resource Scheduler (DRS) Maintenance Mode
        • Migrate running VMs to other servers in the pool
        • Automatically distribute workloads for optimal performance
     • Storage Maintenance
        • VMware vSphere® Storage vMotion
        • Migrate VM disks to other storage targets without disruption
     • Key Benefits
        • Eliminate downtime for common maintenance
        • No application or end user impact
        • Freedom to perform maintenance whenever desired
  13. App-Aware HA Through Health Monitoring APIs
     • Leverage third-party solutions that integrate with VMware HA (for example, Symantec ApplicationHA)
     [Diagram: two VMs (APP on OS) protected by VMware HA, with callouts 1, 2, 3 and the label "App Restart"]
     1. Database Health Monitoring
        • Detect database service failures inside the VM
     2. Database Service Restart Inside VM
        • App start / stop / restart inside the VM
        • Automatic restart when an app problem is detected
     3. Integration with VMware HA
        • VMware HA is automatically initiated when:
           • App restart fails inside the VM
           • Heartbeat from the VM fails
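The three-stage escalation on this slide (monitor, restart inside the guest, hand over to VMware HA) can be sketched as a small decision loop. This is a simplified model only, not the API of Symantec ApplicationHA or of VMware HA: the class, the callback names, and the retry limit are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AppMonitor:
    """Hypothetical in-guest monitor that escalates to the platform when restarts fail."""

    is_healthy: Callable[[], bool]    # stage 1: health probe for the database service
    restart_app: Callable[[], bool]   # stage 2: in-guest restart, returns True on success
    max_restart_attempts: int = 3     # assumed limit, not taken from the slide
    events: List[str] = field(default_factory=list)

    def check_once(self) -> str:
        """Run one monitoring cycle and return the resulting state."""
        if self.is_healthy():
            return "healthy"
        for attempt in range(1, self.max_restart_attempts + 1):
            self.events.append(f"restart attempt {attempt}")
            if self.restart_app() and self.is_healthy():
                return "recovered-in-guest"
        # Stage 3: in-guest recovery failed, so the VM-level mechanism takes over.
        self.events.append("escalate to VMware HA")
        return "escalate-to-ha"


if __name__ == "__main__":
    service = {"up": False}

    def restart() -> bool:
        service["up"] = True
        return True

    monitor = AppMonitor(is_healthy=lambda: service["up"], restart_app=restart)
    print(monitor.check_once(), monitor.events)
```

Keeping the probe and restart actions as injected callbacks lets the escalation logic be exercised without a real SQL Server service.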
  14. Standalone SQL Server VM with VMware HA, DRS, & vMotion
     • Highlights:
        • Quickly restore service after host failure
        • Simple to configure and easy to manage
        • Can use Standard editions of Windows and SQL Server
     • Note:
        • Protection against hardware failures only
        • Does not provide application-level protection
  15. Scenario 2 – AlwaysOn High Availability: What happens when a node fails?
  16. What are SQL Server AlwaysOn Availability Groups?
     • Database-level replication over IP, with no shared storage requirement
     • Same advantages as failover clustering (service availability, patching, etc.)
     • Two copies of the data, giving protection from data corruption
     • Readable secondary
     • Automatic or manual failover through WSFC policies
  17. Scenario 2 – Improving on AlwaysOn High Availability
     • Technology Chosen
        • AlwaysOn AG for HW and SW protection
        • VMware HA & vMotion for added protection
        • SRM for DR, with SRM integration to restore the AG on the remote site
     • Benefits
        • Quickly restart a failed AAG node to bring the cluster back to full capabilities
        • Migrate nodes off physical hardware (hosts or storage) without downtime or impact
        • Automate disaster recovery at the remote site with SRM
  18. vSphere HA with AlwaysOn Availability Group (AG)
     • Protection against HW/SW failures and DB corruption
     • Storage flexibility (FC, iSCSI, NFS)
     • Compatible with vMotion, DRS, HA
     • RTO of a few seconds
     • vSphere HA + AlwaysOn AG
        • Seamless integration: the VM rejoins the AG after vSphere HA recovery
        • Can shorten the time that the database is in an unprotected state
        • Reduces synchronization time after VM recovery
  19. Demo: Deploying AlwaysOn Availability Group on vSphere
  20. Deploying AlwaysOn Availability Group on vSphere
     • Step 1: vSphere platform setup
        • Ensure each disk is created as Thick Eager Zeroed
        • Create a DRS anti-affinity rule to avoid running the VMs on the same host
     • Step 2: Create WSFC
        • Install the Failover Clustering feature
        • Create a cluster for the Availability Group
        • Add the SQL Server VMs as cluster nodes
        • Configure the quorum policy to use "Node and File Share Majority"
     • Step 3: Enable SQL Server for AlwaysOn
        • Configure the SQL Server service to enable AlwaysOn Availability Groups on each SQL Server instance
        • Restart the SQL Server service
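The "Node and File Share Majority" choice in Step 2 matters for a two-node cluster because a majority of votes is needed to keep the cluster running, and two nodes alone cannot form a majority once one is lost. The sketch below illustrates the vote arithmetic under a deliberately simplified model: one vote per node plus one for the file share witness, with no dynamic quorum adjustments. The function names are hypothetical.

```python
def votes_needed(total_votes: int) -> int:
    """A strict majority: more than half of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(nodes_up: int, total_nodes: int, witness_up: bool, has_witness: bool = True) -> bool:
    """Return True if the surviving voters still hold a majority."""
    total_votes = total_nodes + (1 if has_witness else 0)
    votes_up = nodes_up + (1 if has_witness and witness_up else 0)
    return votes_up >= votes_needed(total_votes)


if __name__ == "__main__":
    # Two-node AG cluster with a file share witness: one node can fail.
    print(has_quorum(nodes_up=1, total_nodes=2, witness_up=True))
    # Same cluster without a witness: losing one node loses quorum.
    print(has_quorum(nodes_up=1, total_nodes=2, witness_up=False, has_witness=False))
```

With two nodes and a witness there are three votes, so any single failure leaves two of three, which is still a majority.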
  21. Deploying AlwaysOn Availability Group on vSphere – Continued
     • Step 4: Create AG for the AdventureWorks2012 database
        • Prerequisite: set the database to use the full recovery model
        • Prerequisite: take a full backup of the database
        • Create a 2-node AG with synchronous commit and automatic failover
        • Create a listener for the AG
     • Step 5: Monitor AG from Dashboard
        • The dashboard shows the health state of the AG and the status of each replica
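Step 4 maps onto a handful of T-SQL statements run on the primary replica. The sketch below generates them as strings so the sequence is easy to review before running it; it is not the script used in the demo. The helper name and every server name, endpoint URL, path, and IP address passed to it are placeholders, and the steps on the secondary replica (restoring the database WITH NORECOVERY and joining the AG) are omitted.

```python
from typing import List, Tuple


def build_ag_statements(
    database: str,
    ag_name: str,
    replicas: List[Tuple[str, str]],  # (server name, mirroring endpoint URL) per replica
    backup_path: str,
    listener_name: str,
    listener_ip: str,
    subnet_mask: str,
    listener_port: int = 1433,
) -> List[str]:
    """Return the primary-side T-SQL for a synchronous, automatic-failover AG."""
    replica_clauses = ",\n".join(
        f"    N'{server}' WITH (\n"
        f"        ENDPOINT_URL = N'{endpoint_url}',\n"
        f"        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,\n"
        f"        FAILOVER_MODE = AUTOMATIC)"
        for server, endpoint_url in replicas
    )
    return [
        # Prerequisite: full recovery model.
        f"ALTER DATABASE [{database}] SET RECOVERY FULL;",
        # Prerequisite: full database backup.
        f"BACKUP DATABASE [{database}] TO DISK = N'{backup_path}';",
        # Two-node AG with synchronous commit and automatic failover.
        f"CREATE AVAILABILITY GROUP [{ag_name}]\n"
        f"FOR DATABASE [{database}]\n"
        f"REPLICA ON\n{replica_clauses};",
        # Listener that clients connect to instead of a specific node.
        f"ALTER AVAILABILITY GROUP [{ag_name}]\n"
        f"ADD LISTENER N'{listener_name}' (\n"
        f"    WITH IP ((N'{listener_ip}', N'{subnet_mask}')), PORT = {listener_port});",
    ]


if __name__ == "__main__":
    statements = build_ag_statements(
        database="AdventureWorks2012",
        ag_name="DemoAG",  # placeholder name
        replicas=[
            ("SQLVM1", "TCP://sqlvm1.example.local:5022"),
            ("SQLVM2", "TCP://sqlvm2.example.local:5022"),
        ],
        backup_path=r"\\fileshare\backup\AdventureWorks2012.bak",
        listener_name="DemoAG-Listener",
        listener_ip="10.0.0.50",
        subnet_mask="255.255.255.0",
    )
    print("\n\n".join(statements))
```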
  22. Scenario 3 – SQL Server Failover Clustering (Shared Disk)
  23. What is Microsoft Failover Clustering?
     • Provides application high availability through a shared-disk architecture
     • One copy of the data; relies on storage technology to provide data redundancy
     • Automatic failover for any application or user
     • Suffers from restrictions in storage and VMware configuration
  24. vSphere HA with Failover Clustering
     • Highlights:
        • RTO of a few seconds
        • Protection against HW/SW failures but not DB corruption
        • Legacy application support (those not mirror-aware)
     • Note:
        • DRS and vMotion not available (only cold migration)
        • No protection from data corruption or storage failures
        • Storage must be FC
        • Must use RDMs
  25. VMware Support For Microsoft Clustering On vSphere
     Columns FC through FCoE are storage protocol support; RDM and VMFS are shared disk support. Bracketed numbers are the slide's footnote markers.

     Configuration                             vSphere  VMware HA  vMotion DRS  Storage vMotion  MSCS node limits  FC   In-guest iSCSI  Native iSCSI  In-guest SMB  FCoE     RDM      VMFS
     Shared disk
     MSCS with Shared Disk                     Yes      Yes [1]    No           No               2; 5 (5.1 only)   Yes  Yes             No            Yes [5]       Yes [4]  Yes [2]  Yes [3]
     Exchange Single Copy Cluster              Yes      Yes [1]    No           No               2; 5 (5.1 only)   Yes  Yes             No            Yes [5]       Yes [4]  Yes [2]  Yes [3]
     SQL Clustering                            Yes      Yes [1]    No           No               2; 5 (5.1 only)   Yes  Yes             No            Yes [5]       Yes [4]  Yes [2]  Yes [3]
     SQL AlwaysOn Failover Cluster Instance    Yes      Yes [1]    No           No               2; 5 (5.1 only)   Yes  Yes             No            Yes [5]       Yes [4]  Yes [2]  Yes [3]
     Non-shared disk
     Network Load Balance                      Yes      Yes [1]    Yes          Yes              Same as OS/app    Yes  Yes             Yes           N/A           Yes      N/A      N/A
     Exchange CCR                              Yes      Yes [1]    Yes          Yes              Same as OS/app    Yes  Yes             Yes           N/A           Yes      N/A      N/A
     Exchange DAG                              Yes      Yes [1]    Yes          Yes              Same as OS/app    Yes  Yes             Yes           N/A           Yes      N/A      N/A
     SQL AlwaysOn Availability Group           Yes      Yes [1]    Yes          Yes              Same as OS/app    Yes  Yes             Yes           N/A           Yes      N/A      N/A

     • Shared disk configurations: supported on vSphere with additional considerations for storage protocols and disk configs
     • Non-shared disk configurations: supported on vSphere just like on physical
     * Use affinity/anti-affinity rules when using vSphere HA
     ** RDMs required in "Cluster-across-Box" (CAB) configurations, VMFS required in "Cluster-in-Box" (CIB) configurations
     VMware Knowledge Base Article: http://kb.vmware.com/kb/1037959
  26. Scenario 4 – Rolling Upgrades: Patching without clusters
  27. Patching Non-clustered Databases
     • Benefits
        • No need to deploy an MS cluster simply for patching / upgrading the OS and database
        • Ability to test in a controlled manner (multiple times if needed)
        • Minimal impact to the production site until OS patching is completed and tested
        • Patching of the secondary VM can occur during regular business hours
     • Requires you to lay out VMDKs correctly to support this scenario
  28. Scripted MS SQL Server Rolling Patch Upgrades
     VMware PowerCLI and PowerShell provide a reproducible result.
     What about…
     • Audit trail / log of execution?
     • Which roles participate in managing the upgrade, and how?
     [Diagram labels: VMware ESX, VMware ESXi]
  29. Use vCenter Orchestrator and vCloud Automation Center to Enhance Rolling Patch Upgrades
     • Automation Execution and Status
        • Workflows provide a powerful means for process flow and control
        • Creates a standard definition of infrastructure processes
        • Execution status available in real time
     • Integrates with Scripting and Systems
        • Managed PowerShell execution
     • Self Service
        • Self-service portal
        • Initiated by assigned user roles
        • Delegated approvals
  30. Demo: Automated Rolling Patch Upgrade using Standby VM
  31. Rolling Patch Upgrade Using Standby VM
     • Step 1: Configure Standby VM
        • Create the VM using SQL Server Sysprep, or using an OS-only clone plus a SQL Server install
        • Apply any server-level configuration changes
        • Patch the Standby VM to the target service pack level
        • Start the client app (for demo purposes only)
     • Step 2: Remove Primary VM from public network
        • Disconnect the public NIC
        • Observe: the client temporarily loses its connection and loops trying to reconnect
     • Step 3: Hot remove resource from Primary VM
        • Detach the database from the SQL Server instance using a script
        • Take the disk offline
        • Hot remove the VMDK from the VM
  32. Rolling Patch Upgrade Using Standby VM – Continued
     • Step 4: Hot add resource to Standby VM
        • Hot add the VMDK to the Standby VM
        • Bring the disk online
        • Attach the database to the SQL Server instance
     • Step 5: Perform final role switch
        • Configure the Standby VM to take the IP address of the Primary VM's public NIC. The Standby is now the new primary.
        • Observe: the client automatically reconnects to the new primary, which runs the updated service pack
     • The old Primary VM can now be taken down to apply the service pack
     • See the blog post: http://blogs.vmware.com/apps/2011/11/sql-server-rolling-patch-upgrade-using-standby-vm.html
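The ordering of Steps 2 through 5 is what keeps the database disk from ever being attached to two SQL Server instances at once. The sketch below models that hand-off in memory and records an audit trail of each action, in the spirit of the audit question raised earlier in the deck. It is a simulation only: it does not call PowerCLI, vCenter Orchestrator, or SQL Server, and the class and field names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VM:
    name: str
    patch_level: str
    public_ip: Optional[str] = None   # None models a disconnected public NIC
    data_disk: Optional[str] = None   # name of the attached data VMDK, if any
    database_attached: bool = False


@dataclass
class RollingUpgrade:
    primary: VM
    standby: VM
    audit_log: List[str] = field(default_factory=list)

    def _log(self, message: str) -> None:
        self.audit_log.append(message)

    def run(self) -> VM:
        """Move the database disk and service IP from primary to standby, in order."""
        # Step 2: take the primary off the public network.
        service_ip = self.primary.public_ip
        self.primary.public_ip = None
        self._log(f"disconnected public NIC on {self.primary.name}")
        # Step 3: detach the database, then hot remove the disk from the primary.
        self.primary.database_attached = False
        disk = self.primary.data_disk
        self.primary.data_disk = None
        self._log(f"detached database and removed {disk} from {self.primary.name}")
        # Step 4: hot add the disk to the standby, then attach the database.
        self.standby.data_disk = disk
        self.standby.database_attached = True
        self._log(f"added {disk} and attached database on {self.standby.name}")
        # Step 5: the standby takes over the service IP and becomes the new primary.
        self.standby.public_ip = service_ip
        self._log(f"{self.standby.name} now serves {service_ip}")
        return self.standby


if __name__ == "__main__":
    old = VM("sql-primary", "SP1", public_ip="10.0.0.20", data_disk="data.vmdk", database_attached=True)
    new = VM("sql-standby", "SP2")
    upgrade = RollingUpgrade(old, new)
    upgrade.run()
    print("\n".join(upgrade.audit_log))
```

In a real workflow each logged line would correspond to a scripted action whose success is verified before the next one starts.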
  33. Disaster Recovery and Backup
  34. VMware vCenter Site Recovery Manager™ (SRM)
     • Relies on storage or vSphere host replication
     • Allows creation, maintenance, and execution of an automated process to facilitate site recovery
     • Safe testing without impacting the production environment
     • Self-documenting
  35. VMware vCenter SRM with SQL Server AAG
     • AAG provides local availability
     • Storage replication keeps the DR facility in sync
     • During a site failure, the admin has full control of recovery
     • After the workflow is initiated, SRM automates the recovery process
     • The entire process can be tested without actually failing over services!
  36. In-guest SQL Server-Aware Backup Solution
     • Standard method for physical or virtual
     • Agent runs in the VM guest and handles database quiescing
     • Data is sent over the IP network
     • Can affect CPU utilization in the guest OS
  37. Array-based Backup
     • Backup vendor software coordinates with VSS to create a supported backup image of the SQL Server databases
     • Snapshotted databases can later be streamed to tape as flat files with no I/O impact on the production SQL Server
  38. Putting It All Together
     • VMware
        • Planned downtime avoidance
           • vMotion & Storage vMotion
           • Rolling SQL Server upgrades with vCO / vCAC
        • Unplanned downtime recovery
           • vSphere HA + AppAware HA
           • vSphere FT
        • Disaster recovery
           • Site Recovery Manager
     • SQL Server 2012
        • AlwaysOn Availability Groups
     • Pre-SQL Server 2012
        • Failover Clustering
        • Database Mirroring
        • Log Shipping
        • Replication
  39. Summary
     • Quality of Service (QoS)
        • Guaranteed performance SLAs through resource controls, dynamic load balancing, capacity & performance management
        • Simplified security SLAs with app protection
     • Availability
        • Protection against app failures through high availability and fault tolerance
        • Simplified business continuity with automated disaster recovery & backup
     • Time to Market (TTM)
        • Reduced app provisioning times to minutes through use of templates & intelligent policy management
        • Dynamic scaling of apps through scale-up/scale-out capacity on demand
     Complete Flexibility. Non-Stop Reliability.
  40. Resources
     • Visit us on the web to learn more on specific apps
        • http://www.vmware.com/solutions/business-critical-apps/
     • Visit our Business Critical Application blog
        • http://blogs.vmware.com/apps/
     …and please attend our sessions listed below for more detailed information on virtualizing and managing Tier 1 Apps on VMware!
     • VAPP5473 – Automated Management of Tier-1 Applications on VMware
     • VAPP5613 – Successfully Virtualize Microsoft Exchange Server
     • VAPP5932 – Virtualizing Highly Available SQL Servers
     • VAPP6124 – Automating VMware Cloud and Virtualization Deployments with Dell Active Infrastructure
     • VAPP5618 – Virtualize Active Directory, the Right Way!
     • VAPP4906 – Architecting Oracle Databases on vSphere 5 with NetApp Storage
     • VAPP5834 – Virtualizing Mission Critical Oracle RAC with vSphere and vCOPS
     • BCO4905 – Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager
     • VAPP4813 – Real-world Design Examples for Virtualized SAP Environments
     • VCM4891 – Performance Management of Business Critical Applications using vCenter Operations Management
  41. Questions?
  42. Other VMware Activities Related to This Session
     • HOL:
        • HOL-SDC-1304: vSphere Performance Optimization
        • HOL-SDC-1317: vCloud Suite Use Cases - Business Critical Applications
     • Group Discussions:
        • VAPP1006-GD: SQL/MS Apps with Jeff Szastak
  43. THANK YOU
  44. Virtualizing Highly Available SQL Servers (Scott Salyer, VMware; session VAPP5932, #VAPP5932)