SQL Server 2012
High Availability
and DR
Philadelphia SQL Server
User’s Group
07-December-2011
About Me
 @jdanton  on Twitter
 Joedantoni.wordpress.com
 Videos and Blogs at SSWUG.org
 Vice President of the Philadelphia SQL
  Server UG
Speaker Rate
 This is a new talk for me, and I’d love feedback


http://spkr8.com/s/19509
Agenda
 SQL Server 2008 to 2012: What’s Changed in HA and DR
 Geo-Clustering
 All about Availability Groups
Learning Objectives
 SQL Server HA and DR
 What’s involved in SQL clustering
 How it works
 What’s new in 2012 HA/DR


 This presentation is geared towards DBAs, so feel free to stop me at any time with questions
Licensing (What’s changed)
 The  Availability Group features will require
  the Enterprise Edition of SQL Server
 The licensing model for SQL Enterprise
  Edition has changed. Consult your friendly
  Microsoft sales representative for more
  details
 Mirroring is listed as deprecated in Standard Edition, but it will still be there in 2012
Windows Core Support
 Server Core is a version of Windows with no GUI
 Allows for fewer patches
 Uses PowerShell and MMCs for support
SQL Server 2012
 Extended   Events are used much more
  heavily
 Slipstream Install no longer required—SQL
  will check for updates from your Windows
  Update source
    Can use internet Windows Update or
     internal source
High Availability and DR
Options in SQL 2008


 SQL Server Clustering
 SQL Server Mirroring
 Peer to Peer Replication
 SQL Server Log Shipping*
HA and DR Options in SQL
Server 2012
 Backup   and Recovery
 Mirroring
 Availability Groups (2012)
 Log Shipping
 Replication
 SAN Replication*
 Virtualization*
Clustering--2008
 SQL clustering required a single subnet to be used across the whole cluster

     10.10.100.10
 Cluster failover is controlled by the isAlive/looksAlive checks, which check the SQL service and run SELECT @@SERVERNAME
What’s new in SQL Server 2012
HA/DR
 New term: Witness Disk
     Can be a physical (SAN) disk or a cluster file share
 Multi-subnet clustering is supported
     Requires SAN replication
 Flexible Failover
 The BIG one: AlwaysOn Availability Groups
Clustering 2012
 Full support for geo-distributed clusters
 Flexible failover model
 TempDB on Non-shared Disk Resource
     Makes PCI-based Solid State Drive an
      option
     No check for this as of CTP3—instance
      won’t start if TempDB drive location not
      available
Geo-Distributed Clustering
 Requires SAN replication ($$$$)
 Two of everything
 Requires really fast network connection
 Requires some trickery at the
  network/DNS level for connectivity
Multi-Subnet Cluster
Flexible Failover
 Replaces looksAlive/isAlive functionality in
  SQL Clusters (and is used for Availability
  Groups)
 Now runs sp_server_diagnostics
     Two new cluster resource properties
       HealthCheckTimeout (Default 60 sec/Minimum 15 sec)
       FailureConditionLevel
Flexible Failover Policies for Clusters
Level        Condition                                       Description
0            No automatic failover or restart                Indicates that no failover or restart will be triggered automatically on any failure conditions.
1            Failover or restart on server down              SQL Server service is down.
2            Failover or restart on server unresponsive      SQL Server instance is not responsive (Resource DLL cannot receive data from sp_server_diagnostics within the HealthCheckTimeout settings).
3 (Default)  Failover or restart on critical server errors   System stored procedure sp_server_diagnostics returns ‘system error’. (Critical errors > 20)
4            Failover or restart on moderate server errors   System stored procedure sp_server_diagnostics returns ‘resource error’. (Moderate errors > 17)
5            Failover or restart on any qualified failure    System stored procedure sp_server_diagnostics returns ‘query_processing error’. (Deadlock)
             conditions
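The policy levels can be read as a threshold: each level also covers everything the lower levels cover. The sketch below is a small Python model of that reading, not the cluster resource DLL's actual logic. The `Condition` enum and the `should_failover` helper are hypothetical names introduced only for illustration.

```python
from enum import IntEnum


class Condition(IntEnum):
    """Lowest policy level at which each observed condition triggers action.

    Hypothetical model of the table above; the values mirror the Level column.
    """
    SERVICE_DOWN = 1
    UNRESPONSIVE = 2
    SYSTEM_ERROR = 3
    RESOURCE_ERROR = 4
    QUERY_PROCESSING_ERROR = 5


def should_failover(level: int, observed: Condition) -> bool:
    """Return True if a cluster configured at `level` would fail over or restart.

    Assumes levels are cumulative: level N reacts to every condition
    listed for levels 1..N, and level 0 never reacts.
    """
    if not 0 <= level <= 5:
        raise ValueError("level must be between 0 and 5")
    return level >= observed


if __name__ == "__main__":
    for cond in Condition:
        print(cond.name, should_failover(3, cond))
```

At the default level of 3, this model reacts to a stopped service, an unresponsive instance, or a system error, and ignores resource and query processing errors.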
Understanding Quorum
 There are a few slides on this topic because it’s a little confusing
     In a nutshell, your cluster has to be able to talk to itself to keep the cluster service up and running
     This applies to both SQL Server Failover Cluster Instances and AlwaysOn Availability Groups
Quorum
 Quorum is critical: it contains the master copy of the cluster’s configuration
 Serves as a tiebreaker if network communications between cluster nodes fail
 If quorum fails, the cluster is shut down until it’s restored
Quorum Models
 Node and Disk Majority (Default)
 Node Majority
 No Majority (Quorum Disk Only)
 Node and File Share Majority (Good for
  Geo Clusters)
Quorum Failure Tolerance
Number of Nodes                     2    3    4    5    6    7
Node Majority                       0    1    1    2    2    3
Node and Disk/File Share Majority   1    2    2    3    3    4

Node failures tolerated with Node and Disk/File Share Majority:
• If the disk is up: RoundUp(Total # of Nodes / 2)
• If the disk is down: RoundUp(Total # of Nodes / 2) - 1
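One way to sanity-check quorum numbers is to count votes directly. The sketch below assumes one vote per node, one vote for a disk or file share witness that stays online, and that the cluster stays up only while a strict majority of votes remains. The function name `tolerated_node_failures` is hypothetical.

```python
def tolerated_node_failures(nodes: int, witness: bool = False) -> int:
    """Return how many nodes can fail while a strict vote majority survives.

    Assumptions (not from the slides): every node has one vote, the
    witness (if any) has one vote and remains online throughout.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    votes = nodes + (1 if witness else 0)
    majority = votes // 2 + 1          # smallest strict majority
    return votes - majority            # votes that may be lost


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, tolerated_node_failures(n), tolerated_node_failures(n, witness=True))
```

Under these assumptions the strict count matches the Node Majority row for every node count and the witness row for even node counts. For odd node counts with a witness it gives one fewer than the round-up rule (for example 1 rather than 2 for three nodes), which is consistent with the usual guidance to pair a witness with an even number of nodes.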
DR in SQL 2008
 Mirroring
     Allowed automatic failover, but only one
      target
     Mirror target is unreadable
 Log Shipping
     Allowed multiple targets, but failover is a manual process that requires a connection string change
 Replication
AlwaysOn Availability Groups
AlwaysOn Requirements
 Windows  Enterprise (Clustering is a
  requirement)
 SQL Server Enterprise Edition
 Windows Cluster
 No shared storage is required
 Quorum Disk Preferred
Flexible AG Failover
 Similar to how a failover clustered
  instance fails over
 Connects to instance every 30 seconds to
  perform health check
 Also, similar quorum model to Windows
  Failover Clustering
Allows for SAN Less HA/DR
 This isn’t a huge thing for SQL Server at big shops
 It may allow us to incorporate a level of
  DR into a virtual environment
Failover Modes
 Automatic  failover
 Planned manual failover (without data
  loss)
 Forced manual failover (with possible
  data loss)
Failover
                     Asynchronous-    Synchronous-commit    Synchronous-commit
                     commit mode      mode with manual-     mode with automatic-
                                      failover mode         failover mode
 Automatic failover  No               No                    Yes
 Manual failover     No               Yes                   Yes
 Forced failover     Yes              Yes                   No
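The matrix above is small enough to encode as a lookup table, which can be handy in runbooks or monitoring scripts. This is a minimal sketch that simply transcribes the slide. The mode keys and the `supports` helper are hypothetical names, not SQL Server identifiers.

```python
# Transcription of the slide's failover matrix.
# Keys are illustrative labels, not values SQL Server itself reports.
FAILOVER_SUPPORT = {
    "asynchronous-commit": {
        "automatic": False, "manual": False, "forced": True,
    },
    "synchronous-commit/manual": {
        "automatic": False, "manual": True, "forced": True,
    },
    "synchronous-commit/automatic": {
        "automatic": True, "manual": True, "forced": False,
    },
}


def supports(mode: str, failover: str) -> bool:
    """Return True if the slide lists `failover` as available in `mode`."""
    return FAILOVER_SUPPORT[mode][failover]


if __name__ == "__main__":
    for mode, row in FAILOVER_SUPPORT.items():
        print(mode, [kind for kind, ok in row.items() if ok])
```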
Client Connections in This
Model
 Availability Group Listener (yes, SQL Server now has a listener)
     Works just like a failover cluster instance (single instance, single IP)
     Creates a VCO (AD Virtual Computer Object)
Contained Databases
 Isolate   Database from Instance
     Currently only fully supported with SQL
      Logins
 No  numbered procedures
 Eases database movement
 Allows for ease of migration to Azure


 Not   quite baked out as of RC0
Read Only Replicas
 Can have up to 3
 SQL Client 2012 will allow for this routing
  specifically
 Can take backups from read-only copies*
     Copy-only backups (only full copy, does not affect primary log)
 Indexing must be the same on replicas
 Bad queries can affect status of replica
Considerations for Availability
Groups
   All SQL servers (including the secondary in the DR site) in the
    same Windows domain
   All the databases must be in FULL recovery model
   The unit of failover (for local HA, as well as DR) is at the AG
    level, i.e., group of databases – not the instance
       Consider using Contained Database for containing logins for failover
       For jobs and other objects outside the database, simple
        customization needed
   No delayed apply on the secondary
   Removing log shipping means the regular log backup job is
    removed
         Need to re-establish periodic log backup (essential for truncating
          the log)
    New tools for monitoring and alerting
         AlwaysOn Dashboard
         System Center Operations Manager
Availability Groups
 Demo
Summary
 Lots of change in the HA/DR space
 Licensing also changes—talk to your MS
  rep
 SQL Server Failover Clusters still a good HA
  option
 AlwaysOn Availability Groups add a lot
  more flexibility to DR
Contact Info
 @jdanton
 jdanton1@yahoo.com




http://spkr8.com/s/19509

Editor's Notes

  • #9 Extended Events came out in SQL Server 2008, but very few people, myself included, paid much attention. Those who did found the implementation awkward and confusing. Only a few people persevered enough to discover just how powerful these things are, which is why anyone who wants to learn about Extended Events should plan on starting in one place: Jonathan Kehayias’ blog. Books Online helps get you started, but Jonathan really makes it all take off.
  • #10 SQL Server clustering is the most obvious high availability solution that everyone knows about. However, mirroring between two SQL Servers (with a witness server) can also provide a level of both HA and DR. The other two options are a little more controversial and more complicated to set up. Both peer-to-peer replication and SQL log shipping can provide some measure of HA, but there are caveats, and some data loss is possible. This is a little outside the scope of this presentation, so if you would like more detail on these topics, I highly recommend Paul Randal’s white paper on SQL HA and DR options. I’ll provide a link at the end of this presentation.
  • #11 DR options: yes, backup and recovery is your first line of defense in the event of a disaster. You should have extensive monitoring and notification around your backup process, and take regular transaction log backups if you need point-in-time recovery.
    Mirroring is probably the best high availability option. With a witness server (a server that sits in between the two mirrors) you get automatic failover if your primary instance goes down. Most applications that use Microsoft connections to your database can support mirroring. The only negative is that unless you have Enterprise Edition, you are limited to synchronous mirroring, which can have a performance impact on your primary. Enterprise Edition brings in asynchronous mirroring, which allows for greater flexibility and distance between sites with no performance impact.
    Log shipping and replication: both of these require manual intervention in the event of a failure. However, they are very mature technologies and can work over great distances. This is not a DR scenario, but I have an application which replicates from the US to Switzerland over a nominal network connection, running on SQL 2000, and I haven’t had to touch it in two years. (Knocks on wood.)
    Lastly, SAN replication: this is really cool technology, and it can enable the concept of geo-distributed clusters (also covered in Paul’s white paper). This is pretty far out of scope for today’s presentation, but I’ll say this: while really cool, it’s really complex to set up and really expensive. You need additional software from your SAN vendor, which is always pretty pricey, and the additional network bandwidth to transfer bits in real time over the network. When I was at Wyeth, we did this between Philadelphia and Pearl River, NY for the SAP system that ran the business, but the cost made it prohibitive to do much else. Also, when it goes wrong, it can be ugly.
  • #29 Failover types and failover sets:
      • Automatic failover: Supported only when the current primary and one secondary replica are both configured with failover mode set to AUTOMATIC and the secondary replica is currently synchronized. If the failover mode of either the primary or secondary replica is MANUAL, automatic failover cannot occur. It occurs only between a primary replica and a secondary replica that are configured for synchronous-commit mode and automatic failover mode when the secondary replica is in the SYNCHRONIZED state.
      • Planned manual failover (without data loss): Planned manual failover, or manual failover, is useful for administrative purposes. It is supported only if both the primary replica and secondary replica are configured for synchronous-commit mode and the secondary replica is currently synchronized (in the SYNCHRONIZED state). A database administrator manually initiates a manual failover.
      • Forced manual failover (with possible data loss): Intended only for disaster recovery, forced manual failover, or forced failover, is supported only when the synchronization health of the target availability replica is either NOT_SYNCHRONIZING or SYNCHRONIZING. This is the only form of failover supported in asynchronous-commit availability mode.
      • Automatic failover set: Exists only when a pair of availability replicas (including the current primary replica) are configured for synchronous-commit mode with automatic failover, if any. An automatic failover set takes effect only if the secondary replica is currently SYNCHRONIZED with the primary replica.
      • Synchronous-commit failover set: Exists only when a set of two or three availability replicas (including the current primary replica) are configured for synchronous-commit mode. A synchronous-commit failover set takes effect only if the secondary replicas are configured for manual failover mode and at least one secondary replica is currently SYNCHRONIZED with the primary replica.
      • Entire failover set: Within a given availability group, the set of all availability replicas whose operational state is currently ONLINE, regardless of availability mode and of failover mode. The entire failover set becomes relevant when no secondary replica is currently SYNCHRONIZED with the primary replica.
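The eligibility rules in note #29 can be expressed as a short check on replica configuration and synchronization state. The sketch below is a simplified Python model of those rules as written in the note. The `Replica` class and `allowed_failovers` function are hypothetical names, and real systems should read replica state from the server rather than from a model like this.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical, simplified view of an availability replica."""
    sync_commit: bool      # True for synchronous-commit availability mode
    auto_failover: bool    # True if failover mode is AUTOMATIC
    state: str             # "SYNCHRONIZED", "SYNCHRONIZING", or "NOT_SYNCHRONIZING"


def allowed_failovers(primary: Replica, secondary: Replica) -> set:
    """Return the failover types the note describes as supported for this pair."""
    allowed = set()
    both_sync_commit = primary.sync_commit and secondary.sync_commit
    synchronized = secondary.state == "SYNCHRONIZED"

    # Planned manual failover: both synchronous-commit, secondary SYNCHRONIZED.
    if both_sync_commit and synchronized:
        allowed.add("manual")
        # Automatic failover additionally needs AUTOMATIC mode on both replicas.
        if primary.auto_failover and secondary.auto_failover:
            allowed.add("automatic")

    # Forced failover: target is NOT_SYNCHRONIZING or SYNCHRONIZING.
    if secondary.state in ("NOT_SYNCHRONIZING", "SYNCHRONIZING"):
        allowed.add("forced")
    return allowed


if __name__ == "__main__":
    primary = Replica(sync_commit=True, auto_failover=True, state="SYNCHRONIZED")
    target = Replica(sync_commit=True, auto_failover=True, state="SYNCHRONIZED")
    print(sorted(allowed_failovers(primary, target)))
```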
  • #30 The amount of time that the database will be unavailable during a failover depends on the type of failover and its cause. For more information, see Estimate the Interruption of Service During Failover of an Availability Group (SQL Server).
    Important: To support client connections after failover, except for contained databases, logins and jobs defined on any of the former primary databases must be manually recreated on the new primary database. For more information, see Management of Logins and Jobs for the Databases of an Availability Group (SQL Server).
  • #32 A partially contained database is a contained database that allows the use of uncontained features. Partially contained databases do not allow the following actions or entities:
      • Numbered procedures
      • Schema-bound objects that depend on built-in functions with collation changes
      • Binding changes resulting from collation changes, including references to objects, columns, symbols, or types
      • Replication, change data capture, and change tracking
    Use the sys.dm_db_uncontained_entities and sys.sql_modules (Transact-SQL) views to return information about uncontained objects or features. By determining the containment status of the elements of your applications, you can discover what objects or features need to be replaced or altered for use in a fully contained database.