What is SQL Server Clustering?
Highly Available Database Solution
[Figure: SQL Server instances on Windows Server 2012 R2 nodes]
Benefits of SQL Cluster
> Protection at the instance level through redundancy
> Automatic failover in the event of a failure (hardware failures,
operating system failures, application or service failures)
> Zero reconfiguration of applications and clients during failovers
> Flexible failover policy for granular trigger events for automatic
failovers
> Redundant components
> Configurability and predictability in failover time through indirect
checkpoints
> No more Hardware Compatibility List required!
Prerequisites for Installing SQL Server Cluster (FCI)
> Windows cluster configured/installed
> Cluster validation (if you need Microsoft support)
> Network name (SQL Server virtual name)
> IP address
SQL 2005 Cluster Installation Setup
SQL 2008 and Later Version Cluster Installation Setup
New SQL Server failover cluster installation
Launch a wizard to install a single-node SQL Server 2012 failover cluster.
Add node to a SQL Server failover cluster
Launch a wizard to add a node to an existing SQL Server
2012 failover cluster.
Basic Components you see in Cluster manager for SQL
> Network name -> Active Directory
> IP address -> Domain Name Server (DNS)
> Shared disks -> Storage Area Network (SAN)
SQL Server Components
> SQL Server Database Engine service
> SQL Server Agent service
> SQL Server Analysis Services service, if installed
> One file share resource, if the FILESTREAM feature is installed
How does automatic failover work? (Prior to SQL 2012)
LooksAlive is a quick, lightweight health check
By default it runs at an interval of 5 seconds
It does not impact performance, but it does not perform a thorough check
The check succeeds if the service appears to be running, even though
it might not be operational
If it fails, it calls the "IsAlive" health check
The polling interval can be changed by adjusting the LooksAlivePollInterval
property of the Cluster service
IsAlive runs at an interval of 60 seconds
It performs a more detailed check than LooksAlive
It runs SELECT @@SERVERNAME to ensure that SQL Server is responding to
IsAlive queries
> Does not ensure that all user databases are operational
> If the check fails, the SQL Server resource fails and the Windows Cluster
service will try to bring it online on another node as per the configuration
> The polling interval can be changed by adjusting the IsAlivePollInterval property
of the Cluster service
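The two-tier polling described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the cluster service's actual implementation: the `Instance` class, the `poll` function, and the one-tick-per-second model are hypothetical, and only the 5-second and 60-second intervals come from the slides.

```python
from dataclasses import dataclass

LOOKS_ALIVE_INTERVAL_S = 5   # LooksAlive default interval from the slides
IS_ALIVE_INTERVAL_S = 60     # IsAlive interval from the slides


@dataclass
class Instance:
    """Hypothetical stand-in for a clustered SQL Server instance."""
    service_running: bool   # all that LooksAlive can observe
    answers_queries: bool   # what IsAlive (SELECT @@SERVERNAME) observes


def looks_alive(inst: Instance) -> bool:
    # Lightweight check: passes whenever the service appears to be running.
    return inst.service_running


def is_alive(inst: Instance) -> bool:
    # Thorough check: the instance must actually answer a query.
    return inst.service_running and inst.answers_queries


def poll(inst: Instance, now_s: int) -> str:
    """Return 'ok', 'failed', or 'skipped' for the tick at now_s seconds."""
    if now_s % IS_ALIVE_INTERVAL_S == 0:
        return "ok" if is_alive(inst) else "failed"
    if now_s % LOOKS_ALIVE_INTERVAL_S == 0:
        if looks_alive(inst):
            return "ok"
        # A failed LooksAlive escalates to IsAlive before reporting failure.
        return "ok" if is_alive(inst) else "failed"
    return "skipped"  # no check is scheduled on this tick
```

A hung instance (service running, queries unanswered) passes the 5-second check and is only caught at the next 60-second IsAlive check, which is the gap the slides point out.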
How does automatic failover work? (SQL 2012 and Later)
State of the SQL Server service
The WSFC service monitors the start state of the SQL Server service on the active FCI node to detect when the SQL Server
service is stopped.
Responsiveness of the SQL Server instance
The resource DLL calls the sp_server_diagnostics stored procedure and sets the repeat interval to one-third of the
HealthCheckTimeout setting (default 60 seconds, minimum 15 seconds).
If the dedicated connection is lost, the resource DLL will retry the connection to the SQL instance for the interval specified by
HealthCheckTimeout before it reports to the WSFC service that the SQL instance is unresponsive.
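The one-third relationship between HealthCheckTimeout and the sp_server_diagnostics repeat interval can be worked through with a short sketch. The function and constant names are hypothetical; only the default (60 seconds), the minimum (15 seconds), and the one-third rule come from the text above.

```python
DEFAULT_HEALTH_CHECK_TIMEOUT_S = 60  # default stated in the slides
MIN_HEALTH_CHECK_TIMEOUT_S = 15      # minimum stated in the slides


def repeat_interval_s(health_check_timeout_s: float = DEFAULT_HEALTH_CHECK_TIMEOUT_S) -> float:
    """Return the sp_server_diagnostics repeat interval for a given timeout."""
    if health_check_timeout_s < MIN_HEALTH_CHECK_TIMEOUT_S:
        raise ValueError(
            f"HealthCheckTimeout must be at least {MIN_HEALTH_CHECK_TIMEOUT_S} seconds"
        )
    # The resource DLL uses one-third of the timeout as the repeat interval.
    return health_check_timeout_s / 3
```

With the default timeout this gives a 20-second repeat interval, and with the minimum timeout a 5-second one.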
SQL Server component diagnostics (Failover Policy)
Based on the events sp_server_diagnostics returns
Benefits Of New Failover Policy
Flexible Failover Policy provides administrators control over the conditions when
an automatic failover should be initiated.
[Figure: SQL Server 2008 R2 compared with SQL Server 2012]
Configurable options eliminate false failover
Improved logging for better diagnostics
Failure Condition Levels
Level 0: No automatic failover
Level 1: Instance not started
Level 2: No data from sp_server_diagnostics
Level 3 (default): System unhealthy, stack dumps occurring
Returns 'system error'
Level 4: Resource unhealthy, low on memory
Returns 'resource error'
Level 5: Failover or restart on any qualified failure, such as error 17884 (deadlock)
Returns 'query_processing error'
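How a configured level translates into a failover decision can be sketched as follows. This assumes the levels are numbered 0 through 5 and are cumulative, so each level also covers the conditions of the levels below it; the `Failure` enum and `should_fail_over` function are hypothetical names used only for illustration.

```python
from enum import IntEnum


class Failure(IntEnum):
    """Detected condition; the value is the lowest level that acts on it."""
    INSTANCE_NOT_STARTED = 1
    NO_DIAGNOSTICS_DATA = 2      # no data from sp_server_diagnostics
    SYSTEM_ERROR = 3             # 'system error'
    RESOURCE_ERROR = 4           # 'resource error'
    QUERY_PROCESSING_ERROR = 5   # 'query_processing error'


def should_fail_over(failure_condition_level: int, failure: Failure) -> bool:
    """Return True if the configured level triggers a failover or restart."""
    if not 0 <= failure_condition_level <= 5:
        raise ValueError("FailureConditionLevel must be between 0 and 5")
    # Level 0 never triggers; otherwise any condition at or below the level does.
    return failure_condition_level >= failure
```

At level 3, for example, a 'system error' triggers a failover or restart while a 'resource error' does not.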
Sequence of Events During Failover
1. Unless a hardware or system failure occurs, all dirty pages in the
buffer cache are written to disk.
2. All respective SQL Server services in the resource group are
stopped on the active node.
3. The resource group ownership is transferred to another node in
the FCI.
4. The new resource group owner starts its SQL Server services.
5. Client application connection requests are automatically directed
to the new active node using the same virtual network name
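Configuring FailureConditionLevel Property Settings
# PowerShell: load the failover clustering cmdlets, then set the property on the FCI resource
Import-Module FailoverClusters
$fci = "SQL Server (INST1)"
Get-ClusterResource $fci | Set-ClusterParameter FailureConditionLevel 3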
Failover Cluster Manager Snap-in
> Open the Failover Cluster Manager snap-in.
> Expand Services and Applications and select the FCI.
> Right-click the SQL Server resource and then select Properties.
> Select the Properties tab, enter the desired value for the
FailureConditionLevel property, and then click OK to apply the change.
ALTER SERVER CONFIGURATION SET FAILOVER CLUSTER PROPERTY
FailureConditionLevel = 0;
sp_server_diagnostics
Captures diagnostic data and health information about SQL Server to detect potential failures.
EXEC sp_server_diagnostics [@repeat_interval =] 'repeat_interval_in_seconds'
component_type    Either Instance or AlwaysOn:AvailabilityGroup
component_name    System, Resource, Query_processing, IO_subsystem, Events,
                  or <name of the availability group>
state             0: Unknown, 1: clean, 2: warning, 3: error
data              Specifies data that is specific to the component.
sp_server_diagnostics (continued)
System            Spinlocks, severe processing conditions, non-yielding tasks, page faults, and CPU usage.
Resource          Physical and virtual memory, buffer pools, pages, cache, and other memory objects.
Query_processing  Threads, tasks, wait types, CPU-intensive sessions, and blocking tasks.
IO_subsystem      Reports only a clean (healthy) or warning health state for the IO subsystem.
Events            Errors and events of interest recorded by the server, including details about ring buffer
                  exceptions, ring buffer events about memory broker, out of memory, scheduler monitor,
                  buffer pool, spinlocks, security, and connectivity. Events will always show 0 as the state.
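Reading the state column of a result set can be sketched as below. This is a minimal illustration that assumes the rows have already been fetched as (component_name, state) pairs; the helper names and the sample rows are hypothetical, while the state codes are the ones listed for sp_server_diagnostics.

```python
# State codes as listed for the sp_server_diagnostics state column.
STATE_NAMES = {0: "unknown", 1: "clean", 2: "warning", 3: "error"}


def describe(rows):
    """Map each component name to the readable name of its state."""
    return {name: STATE_NAMES[state] for name, state in rows}


def needs_attention(rows):
    """Return the components reporting a warning (2) or an error (3)."""
    return [name for name, state in rows if state >= 2]


# Hypothetical sample rows, not real server output.
sample_rows = [
    ("system", 1),
    ("resource", 2),
    ("query_processing", 3),
    ("io_subsystem", 1),
    ("events", 0),  # the events component always shows 0 as the state
]
```

Because the events component always reports 0, it never appears in `needs_attention`; its value lies in the data column rather than the state.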
New in 2012 and 2014
> SQL Server multi-subnet clustering
> SMB file share is a supported storage option
> Local Disk is now a supported storage option for tempdb
for SQL Server failover cluster installations
> Flexible failover policy for cluster health detection
> Indirect checkpoints
> Failover cluster instances (FCIs) can now use Cluster Shared
Volumes (CSVs) as cluster shared disks.
> SysPrep now supports failover cluster installations
How To Troubleshoot
> All role/resource group errors and warnings are
reported in Cluster Events in Failover Cluster Manager.
> Instance-specific error and warning events are reported in the
Actions pane under "Show Critical Events" and
"Information Details" when a specific role is selected in the
Failover Cluster Manager snap-in.
> To get detailed cluster logs, use the PowerShell cmdlet
Get-ClusterLog -UseLocalTime. By default it
creates the logs under the '%windir%\Cluster\Reports' path.
> To troubleshoot SQL-related errors, use the SQLDIAG .xel files
created by the resource DLL with sp_server_diagnostics.
> Also refer to the SQL Server error logs and Windows event logs.
Set Failover Policy
Perform the failover