Presentation by Michael Van Horenbeeck: http://twitter.com/mvanhorenbeeck. Video recording available here: http://technet.microsoft.com/en-us/video/windows-server-2012-improvements-in-failover-clustering.
- Support for up to 64 physical nodes and up to 4,000 virtual machines per cluster (compared with 16 physical nodes and 1,000 virtual machines in Windows Server 2008 R2)
- Cluster integration with Windows Server 2012 Server Manager to discover and manage all the nodes of the cluster
- Failover Cluster Manager features to manage large-scale clusters
- Automated management settings for clustered virtual machines and other clustered roles
- Live migration of virtual machine storage
- Support for Hyper-V Replica (which provides automated replication of virtual machines between storage systems, clusters, and data centers)
- Policies to fail back clustered roles to a node following maintenance operations
The updated Failover Cluster Manager snap-in simplifies large-scale management of clustered virtual machines and other clustered roles. The new features include the following:
- Features for managing large numbers of virtual machines or other clustered roles. Search, filtering, and custom views in Failover Cluster Manager make it easier to manage large numbers of clustered virtual machines or other clustered roles.
- Multiselect. Administrators can easily select a specific collection of virtual machines and then perform any needed operation on them (live migration, save, shutdown, or start).
- Simplified live migration and quick migration of virtual machines. Live migration and quick migration are easier to perform from within Failover Cluster Manager.
- Simpler configuration of Cluster Shared Volumes (CSV). Configuring CSVs is a simple right-click in the Storage pane.

New Windows PowerShell cmdlets support capabilities in Failover Clustering including the following:
- Managing cluster registry checkpoints, including cryptographic checkpoints
- Creating Scale-Out File Servers, which provide continuously available and scalable file-based server application storage
- Monitoring of virtual machine applications
- Updating the properties of a Distributed Network Name resource
- Creating a highly available iSCSI Target Server
Administrators can now control the way that the cluster handles virtual machines and other clustered roles by assigning a priority to each clustered role. The possible priorities are:
- High
- Medium
- Low
- No Auto Start

When a clustered role is created, the default priority is Medium.

By assigning priorities to clustered roles, administrators can influence:
- Start order of roles. Virtual machines or clustered roles with higher priority are started before those with lower priority.
- Placement order of roles. Virtual machines or clustered roles with higher priority are placed on appropriate nodes before virtual machines or clustered roles with lower priority.

When the whole cluster is restarted, multiple roles must be placed on multiple nodes in the cluster. If a node crashes or is evicted, multiple roles must be placed on the remaining nodes in the cluster. The placement order of these roles is determined by their priority setting.

If a No Auto Start priority is assigned to a clustered role, the role does not start automatically (does not come online) after it fails, which keeps resources available so other roles can start.
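The start-order rule can be sketched in a few lines of Python. This is a toy model, not the cluster service's own logic: the `Priority` values, the `ClusteredRole` class, the `start_order` function, and the role names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Illustrative numeric values; only their relative order matters here.
    NO_AUTO_START = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class ClusteredRole:
    name: str
    priority: Priority = Priority.MEDIUM  # new roles default to Medium


def start_order(roles):
    """Return the roles that start automatically, highest priority first."""
    startable = [r for r in roles if r.priority != Priority.NO_AUTO_START]
    # sorted() is stable, so roles with equal priority keep their input order.
    return sorted(startable, key=lambda r: r.priority, reverse=True)


roles = [
    ClusteredRole("print-vm", Priority.LOW),
    ClusteredRole("sql-vm", Priority.HIGH),
    ClusteredRole("web-vm"),  # Medium by default
    ClusteredRole("test-vm", Priority.NO_AUTO_START),
]
order = [r.name for r in start_order(roles)]
print(order)  # ['sql-vm', 'web-vm', 'print-vm']
```

The No Auto Start role is left out of the list entirely, which is what frees capacity for the roles that do start.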
The Validate a Configuration Wizard in Failover Cluster Manager simplifies the process of validating hardware and software across servers for use in a failover cluster. The performance for large failover clusters has been improved, and new tests have been added.

The following aspects of validation have been improved:
- Faster validation: Validation tests, especially storage validation tests, run significantly faster.
- Targeted validation of new LUNs: Administrators can target validation of a specific new LUN (disk), rather than testing all LUNs every time they test storage.
- Integration of validation with WMI: Cluster validation status is now exposed through Windows Management Instrumentation (WMI), so that applications and scripts can programmatically consume it.
- New validation tests for CSV: Validation tests help administrators confirm that their configuration meets the requirements for CSVs.
- New validation tests for Hyper-V and virtual machines: Validation tests help administrators determine whether the servers in their cluster are compatible for Hyper-V purposes, that is, whether the servers will support smooth failover of virtual machines from one host to another.

What value do these changes add? The added validation tests help confirm that the servers in the cluster will support smooth failover, particularly of virtual machines from one host to another.
High Availability > Continuous Availability
- Cost and availability continuum: pay for the availability you need
- Build on standard, lower-cost, high-volume hardware components
- OS platform enables partners to deliver a wide range of solutions
- Easy for customers to procure, deploy and manage
- Recoverable device failures and failure isolation
- Transparent and fast recovery from failures without service disruption
- Dynamic Cluster Quorum Model > Last man standing: as available as possible
SMB Direct. This improvement uses a special type of network adapter that has remote direct memory access (RDMA) capability and can function at full speed with very low latency, while using very little CPU. For server roles or applications such as Hyper-V or SQL Server, this allows a remote file server to have performance that compares to local storage.

SMB Multichannel. This improvement allows aggregation of network bandwidth and network fault tolerance if multiple paths are available between the SMB 3.0 client and the SMB 3.0 server. Server applications can then take advantage of all available network bandwidth and be resilient to a network failure.
Key benefits provided by Scale-Out File Server in Windows Server 2012 include:
- Active-Active file shares. All cluster nodes can accept and serve SMB client requests. By making the file share content accessible through all cluster nodes simultaneously, SMB 3.0 clusters and clients cooperate to provide transparent failover to alternative cluster nodes during planned maintenance and unplanned failures without service interruption.
- Increased bandwidth. The maximum share bandwidth is the total bandwidth of all file server cluster nodes. Unlike previous versions of Windows Server, the total bandwidth is no longer constrained to the bandwidth of a single cluster node, but rather to the capability of the backing storage system. You can increase the total bandwidth by adding nodes.
- CHKDSK with zero downtime. CHKDSK in Windows Server 2012 is significantly enhanced to dramatically shorten the time a file system is offline for repair. Cluster Shared Volumes (CSVs) in Windows Server 2012 take this one step further and eliminate the offline phase. A CSV File System (CSVFS) can perform CHKDSK without impacting applications with open handles on the file system.
- Cluster Shared Volume cache. CSVs in Windows Server 2012 introduce support for a read cache, which can significantly improve performance in certain scenarios, such as Virtual Desktop Infrastructure.
- Simpler management. With Scale-Out File Servers, you create the Scale-Out File Server and then add the necessary CSVs and file shares. It is no longer necessary to create multiple clustered file servers, each with separate cluster disks, and then develop placement policies to ensure activity on each cluster node.
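The bandwidth claim boils down to simple arithmetic: share bandwidth grows with the number of nodes until the backing storage becomes the limit. The sketch below shows that relationship only. The function name `share_bandwidth_gbps` and all figures are made up for illustration, and it assumes every node can serve at its full link speed.

```python
def share_bandwidth_gbps(node_bandwidths_gbps, storage_limit_gbps):
    """Upper bound on share bandwidth for an active-active file share.

    The nodes' bandwidths add up, but the backing storage system
    caps whatever the nodes could serve together.
    """
    return min(sum(node_bandwidths_gbps), storage_limit_gbps)


# Hypothetical cluster: 10 Gbps per node, storage that sustains 40 Gbps.
two_nodes = share_bandwidth_gbps([10, 10], 40)
four_nodes = share_bandwidth_gbps([10, 10, 10, 10], 40)
six_nodes = share_bandwidth_gbps([10] * 6, 40)
print(two_nodes, four_nodes, six_nodes)  # 20 40 40
```

In this example, adding nodes helps up to four; beyond that the storage system is the bottleneck and further nodes add availability but not throughput.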
New protocol version: SMB 3.0
- SMB 3.0 Client (Redirector)
  - Client operation replay
  - End-to-end support for replay of idempotent and non-idempotent operations
- SMB 3.0 Server
  - Support for network state persistence
  - Single share spans multiple nodes (active-active shares)
  - Files are always opened Write-Through
- Resume Key, used on failover to:
  - Resume handle state after planned or unplanned failover
  - Fence handle state information
  - Mask some NTFS failover issues
- Witness Protocol
  - Enables faster unplanned failover because clients do not wait for timeouts
  - Enables dynamic reallocation of load
I’m not going to demo this feature right now. Later, when I demo the cool new continuously available file share, I’ll show this feature as well.
Failover Clustering is delivering the infrastructure for the Private Cloud
- Most scalable private cloud
- Flexible deployment choices
- Intelligent placement across the private cloud
- Next generation Cluster Shared Volumes (CSV)
Improvements in Failover Clustering in Windows Server 2012
What? Multiple individual computers working together to increase the availability (and scalability) of a clustered service.
Why? Support business needs by avoiding downtime (increasing availability).
Result: + Complexity, + Management, + Cost
- Management of the private cloud
- Hyper-V: Platform of the private cloud
- Infrastructure of the private cloud
- Scale up: 4,000 VMs in a single cluster
- Scale out: 64 nodes in a cluster
- Concurrent Live Migrations: Multiple simultaneous live migrations for a given source or target
- Live Migration Queuing: In-box tools queue and manage large numbers of VMs
- Storage Live Migration: Moves VHDs from one disk to another
- Hyper-V Replica: Point-in-time replication of VHDs for disaster recovery
[Diagram: two groups of cluster nodes (Node 1, Node 2, Node N). The Hyper-V nodes show VMs, VMBus, VSwitch, the management OS, clustering, and the networking and storage stacks; the file server nodes connect through NICs and HBAs to shared JBOD and external storage arrays.]
[Diagram: SQL Server connected to a file share (fs1share) that is served by File Server Node A and File Server Node B in a File Server Cluster.]
[Diagram: SMB 3.0 Client and SMB 3.0 Server, each split into user mode and kernel mode. Labeled components: Witness Service and Witness Protocol (new); client operation replay in the redirector; state persistence in the SMB 3.0 Server; Resume Key Filter above the file system.]
1. Failover
2. Take resources offline
3. Patch
4. Restart
5. Put online
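Applied to one node at a time, the patch cycle on this slide keeps every clustered role hosted somewhere throughout. The Python sketch below is a toy simulation of that idea, not a description of any Windows tool. The step order is one reading of the slide, and the function `rolling_patch`, the "least-loaded node" placement rule, and the node and VM names are assumptions made for illustration.

```python
# Step names as read from the slide; the order is an assumption.
STEPS = ["failover", "take offline", "patch", "restart", "put online"]


def rolling_patch(cluster):
    """Patch every node in turn while keeping all roles hosted.

    cluster maps a node name to the list of roles it currently hosts
    and is updated in place. Returns a log of (node, step) tuples.
    """
    if len(cluster) < 2:
        raise ValueError("a rolling patch needs at least two nodes")
    log = []
    for node in list(cluster):
        others = [n for n in cluster if n != node]
        # Failover: drain each role to the least-loaded remaining node.
        for role in cluster[node]:
            target = min(others, key=lambda n: len(cluster[n]))
            cluster[target].append(role)
        cluster[node] = []
        log.append((node, "failover"))
        # The node hosts nothing now, so the remaining steps cause no outage.
        for step in STEPS[1:]:
            log.append((node, step))
    return log


cluster = {"node1": ["vm-a", "vm-b"], "node2": ["vm-c"], "node3": []}
log = rolling_patch(cluster)
hosted = sorted(role for roles in cluster.values() for role in roles)
print(hosted)  # ['vm-a', 'vm-b', 'vm-c']
```

The simulation does not fail roles back to their original node after patching; a failback policy would be a separate step on top of this loop.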