5. Single Copy Cluster (SCC) out-of-box provides little high
availability value
On Store failure, SCC restarts store on the same machine;
no clustered mailbox server (CMS) failover
SCC does not automatically recover from storage failures
SCC does not protect your data, your most valuable asset
SCC does not protect against site failures
SCC redundant network is not leveraged by CMS
Conclusion
SCC only provides protection from server hardware failures
and bluescreens, the relatively easy components to recover
Supports rolling upgrades without losing redundancy
6. 2. Inspect logs
Database Database
Log E00.log Log
E0000000012.log
E0000000011.log
1. Copy logs 3. Replay logs
Local Cluster Standby
File
Share
Log shipping to a local Log shipping to a standby
disk Log shipping within a cluster server or cluster
7. Manual AD site: Dallas
“activation” of
Client Access
remote mailbox Server DB4
Outlook (MAPI) OWA, ActiveSync, or server
DB5
client Outlook Anywhere
Standby
Server DB6
AD site: San Jose Mailbox server
can’t co-exist
Client Access
Server
with other roles
SCR
SCR managed
CCR #1 CCR #1 CCR #2
separately; no
CCR #2
Node A Node B Node A Node B GUI
Windows cluster Windows cluster Clustering
knowledge
DB1 DB1 DB4 DB4
required
DB2 DB2 DB5 DB5 Database failure
DB6
requires server
DB3 DB3 DB6
failover
8. Windows Failover Cluster
Default Cluster Clustered Mailbox
Group Server (CMS)
Cluster IP Address CMS IP Address
Cluster Name CMS Name
Cluster Quorum CMS resources
(exres.dll)
CMS disk resources
Cluster
Cluster
Networks
Database
9. Database Availability Group
Active Manager
PAM DAG Networks
SAM
Windows Failover Cluster
Default Cluster Group
Cluster IP Address Cluster
Cluster Name Database
Cluster Quorum
10. Database Availability Group
Mailbox Server Mailbox Server Mailbox Server
Get- Get- Get-
MailboxDatabaseCopyStatus MailboxDatabaseCopyStatus MailboxDatabaseCopyStatus
Move- Move- Move-
ActiveMailboxDatabase ActiveMailboxDatabase ActiveMailboxDatabase
Primary Active Manager Standby Active Manager Standby Active Manager
Storage Storage Storage
12. Reduce complexity
Reduce cost
Native solution - no single point of failure
Improve recovery times
Support larger mailboxes
Support large scale deployments
Make High Availability Exchange
deployments mainstream!
13. Improved mailbox uptime Key Benefits
• Improved failover granularity
• Simplified administration Easier and cheaper to deploy
• Incremental deployment
• Unification of CCR + SCR Easier and cheaper to manage
• Easy stretching across sites Better Service Level
• Up to 16 replicated copies Agreements (SLAs)
More storage flexibility
• Further Input/Output (I/O) Reduced storage costs
reductions Larger mailboxes
• RAID*-less/JBOD** support
Better end-to-end availability
• Online mailbox moves Easier and cheaper to manage
• Improved transport resiliency Better SLAs
*Redundant Array of Independent Disks (RAID) **Just a Bunch of Disks (JBOD)
14. AD site: Dallas
Client Access
Client All clients connect Server DB1
via CAS servers DB3
DB5
AD site: San Jose Mailbox
Server 6
Easy to
Client Access stretch across
Server (CAS) sites
Failover
managed within
Mailbox Mailbox Mailbox Mailbox Mailbox Exchange
Server 1 Server 2 Server 3 Server 4 Server 5
DB1 DB4 DB2 DB5 DB3
DB2 DB5 DB3 DB1 DB4
Database
(DB) centric
DB3 DB1 DB4 DB2 DB5
failover
17. A group of up to 16 servers hosting a set of replicated
databases
Wraps a Windows Failover Cluster
Manages servers’ membership in the group
Heartbeats servers, quorum, cluster database
Defines the boundary of database replication
Defines the boundary of failover/switchover (*over)
Defines boundary for DAG’s Active Manager
Mailbox Mailbox Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3 Server 4 Server 16
18. Unit of membership for a DAG
Hosts the active and passive copies of multiple mailbox databases
Executes Information Store, CI, Assistants, etc., services on active
mailbox database copies
Executes replication services on passive mailbox database copies
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
DB1 DB4 DB3
DB2 DB1 DB4
DB3 DB2
19. Unit of *over
A database has 1 active copy – active copy can be
mounted or dismounted
Maximum # of passive copies == # servers in DAG – 1
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
DB1 DB4 DB3
DB2 DB1 DB4
DB3 DB2 DB1
20. Scope of replication
A copy is either source or target of replication at any given time
A copy is either active or passive at any given time
Only 1 copy of each database in a DAG is active at a time
A server may not host >1 copy of a any database
Mailbox Mailbox
Server 1 Server 2
DB1 X DB1
DB2 DB2
DB1
DB3 DB3
21. Primary Active Manager (PAM)
Runs on the node that owns the default cluster group
(quorum resource)
Gets topology change notifications
Reacts to server failures
Selects the best database copy on *overs
Standby Active Manager (SAM)
Runs on every other node in the DAG
Responds to queries from other Exchange components for
which server hosts the active copy of the mailbox database
22. Continuous replication has the following basic steps:
Database copy seeding of target
Log copying from source to target
Log inspection at target
Log replay into database copy
23. There are three ways to seed the target instance:
Automatic Seeding
Requires 1st log file containing CreateDB record
Update-MailboxDatabaseCopy cmdlet
Can be performed from active or passive copies
Manually copy the database
24. Log shipping in Exchange 2010 leverages Transmission
Control Protocol (TCP) sockets
Supports encryption and compression
Administrator can set TCP port to be used
Replication service on target notifies the active instance
the next log file it expects
Based on last log file which it inspected
Replication service on source responds by sending the
required log file(s)
Copied log files are placed in the target’s Inspector
directory
25. Active Manager selects the “best” copy to become
active when existing active fails
8
6
9
5
7
10
Catalog Crawling
Healthy
Copy status Healthy, DisconnectedAndHealthy,
DisconnectedAndResynchronizing,
or DisconnectedAndResynchronizing, or
SeedingSource
ReplayQueueLength
CopyQueueLength < 10 < 50
ReplayQueueLength < 50
26. Streaming backup APIs for public use have been cut, must use Volume
Shadow Copy Service (VSS) for backups
Backup from any copy of the database/logs
Always choose Passive (or Active) copy
Backup an entire server
Designate a dedicated backup server for a given database
Restore from any of these backups scenarios
Database Availability Group
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
DB1 DB1 DB1
DB2 DB2 DB2
DB3 DB3 DB3
VSS requestor
27. Site/server/disk failure Exchange 2010 HA
Archiving/compliance E-mail archive
Recover deleted items Extended/protected dumpster
retention
Database Availability Group
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
7-14 day lag copy
DB1 DB1 DB1
DB2 DB2 DB2
DB3 DB3 DB3
X
28. Legacy Deployment Steps (CCR/SCC) Exchange 2010 Incremental Deployment
1. Prepare hardware, install proper OS, 1. Prepare hardware, install proper OS,
and update and update
Extra for SCC: configure storage 2. Run Setup and install Mailbox role
2. Build Windows Failover Cluster 3. Create a DAG and replicate databases
Extra for SCC: configure storage 4. Test *overs
3. Configure cluster quorum, file share
witness, and public and private
networks
4. Run Setup in Custom mode and install
clustered mailbox server
5. Configure clustered mailbox server
Extra for SCC: configure disk
resource dependencies
6. Test *overs
29. Easy to add high availability to existing deployment
High availability configuration is post-setup
HA Mailbox servers can host other Server Roles
Reduces cost and complexity of HA deployments
Datacenter 1 Datacenter 2
Database Availability Group
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
33. Exchange Management Shell
Create a DAG
New-DatabaseAvailabilityGroup -Name DAG1 -FileShareWitnessShare
CTD-CH02DAGWitness -FileShareWitnessDirectory C:DAGWitness
Add first Mailbox Server to DAG
Add-DatabaseAvailbilityGroupServer -Identity DAG1 -MailboxServer CTD-
MBX1 -DatabaseAvailablityGroupIpAddresses 192.168.1.100
Add second and subsequent Mailbox Server
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer CTD-
MBX2
Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer
EXMBX3 -DatabaseAvailablityGroupIpAddresses
192.168.1.100, 192.168.2.10
Add Mailbox Database Copy
Add-MailboxDatabaseCopy -Identity MBXDB1 -MailboxServer CTD-MBX2
Extend as needed
34. HA Administration within Exchange
Recovery uses the same simple
operation for a wide range of failures
Simplified activation of Exchange
services in a standby datacenter
Reduces cost and
complexity of management
35. Managing Availability in the Exchange Management Console
1 Select a database
Take action (add
2 View locations and status of 3 copies, change
replicated copies
master, etc.)
40. 2 servers out -> manual
Single Site
activation of server 3
3 NodesDAG, quorum is lost
In 3 server
3 HA Copies
DAGs with more servers sustain
JBOD -> 3 physical Copies
more failures – greater resiliency
Mailbox Mailbox Mailbox
Server 1 Server 2 Server 3
X
Database Availability Group (DAG)
41. Member servers of DAG
CAS/HUB/ CAS/HUB/ can host other server roles
MAILBOX 1 MAILBOX 2
2 server DAGs, with server
DB2 roles combined or not, should
use RAID
42. With each release, our goals are to make Exchange
high availability:
Easier and cheaper to deploy
Easier and cheaper to manage
Support better SLAs with faster and more granular
recoveries
Improve site resiliency support
Our other goal is for highly available deployments to be
mainstream!
Slide Objective: Discuss Continuous Replication.Instructor Notes:Basic steps:Create a target database by seeding the destination with a copy of the source database (this can be accomplished via several ways – log record, streaming copy of database to passive, or offline copy).Monitor for new logs in the source log directory for copying by subscribing to Windows file system notification events.Copy any new log files to the destination inspection log directory using SMB.Inspect the copied log files.Upon successful inspection, move the copied log file to the storage group copy’s log path and replay the copied log into the copy of the database.
Slide Objective: Discuss the Exchange Server 2007 High Availability (HA)/SR solution.Instructor Notes:Key things to call outCMS cannot co-exist with other rolesClustered Exchange requires both Exchange and Windows Failover Clustering knowledge.Failovers occur at the server level.Site resilience is outside of clustering and therefore, requires manual activation.Acronyms Defined: Messaging Application Programming Interface (MAPI)Outlook Web App (OWA)Graphical User Interface (GUI)Cluster Continuous Replication (CCR)Standby Continuous Replication (SCR)
Slide Objective:Instructor Notes:In Exchange Server 2007 and previous versions, Exchange used the cluster resource management model to install, implement and manage the Mailbox server high availability solution. Historically, building a highly available Mailbox server involved first building a Windows Failover Cluster, and then running Exchange Setup in clustered mode. In this mode, the Exchange cluster resource DLL, exres.dll, would be registered and allow the creation of a clustered mailbox server (called an Exchange Virtual Server in legacy versions). When deploying legacy shared storage clusters or single copy clusters, additional steps for configuring storage were needed before and after failover cluster formation, and after clustered mailbox server and storage group formation.
Slide Objective:Instructor Notes:We no longer use the cluster resource model for Exchange high availability. All Exchange cluster resources provided by exres.dll no longer exist, including the construct known as a clustered mailbox server. We create a Windows Failover Cluster, but we do not create any additional groups for Exchange, and there are no storage resources in the cluster. Thus if you examine the cluster using cluster management tools, you’ll see only the core cluster resources (IP Address and Network Name, and if needed, quorum resource). Cluster nodes and networks will also exist, but those are managed by Exchange and not cluster or cluster tools.We use clusapi.dll for cluster, group, cluster network (heartbeating), node management, cluster registry, and a few control code functions. Active Manager stores current mailbox database information (e.g., active and passive data, and mounted data) in the cluster database. Although the information is stored directly in the cluster database, it must not be accessed directly by any integration components.
Slide Objective:Instructor Notes:A DAG is a boundary for replication in Exchange 2010. This can be the built-in continuous replication, or it can be 3rd party synchronous replication that uses the built-in Exchange 3rd party Replication API. Active Manager manages locating databases and handles failovers of replicated databases.
Slide Objective: Discuss Exchange 2010 high availability architecture.Instructor Notes:NOTE: “You can't replicate outside the DAG.” (key difference from SCR)Here is Harv’s current Exchange environment.There are 5 servers <click> in the main datacenter that host mailboxes. These mailbox servers are grouped to provide automatic failover. Each mailbox database has 3 instances, which we’ll refer to as copies, <click> placed on separate servers to provide redundancy. At any given time, only 1 of the 3 database copies is active <click> and accessible to clients.The Front-End Server <click> manages all communications between clients and databases. Outlook clients no longer connect directly to mailbox servers, as they did in previous versions of Exchange.When a client such as Outlook connects <click> to Exchange, it first contacts the Front-End Server. The Front-End Server determines <click> where the user’s active database is located, and forwards the request <click> to the appropriate server. When the client sends an e-mail <click>, the active database is updated. Then, through log shipping <click>, the other 2 passive copies of the database are updated.Let’s say that a disk fails <click>, affecting one of the databases on Mailbox Server 2. In previous versions of Exchange, the administrator would need to failover all the databases on Mailbox Server 2 to recover from this failure, or else restore the Database 2 from a tape backup. However, Exchange’s new architecture supports database-level failover, so Database 2 automatically fails over to Mailbox Server 1 <click> without affecting the other databases. The Outlook client, having lost its connection to the database, automatically contacts <click> the Front-End Server to reconnect. The Front-End Server determines which mailbox server has the active copy of the users’ database. It connects <click> the client to Mailbox Server 1. When new mail is sent <click>, the active database on Mailbox Server 1 is updated. The second copy of the database <click> is also updated through log shipping. The end user is unaware that anything has happened, and Harv can replace the failed disk drive at his leisure. The administrator can set up any number of copies per database to meet the Service Level Agreements for his users. For a special category of users, Harv keeps a 4th database copy on a mail server in a geographically remote location <click>. This server is located in a different Active Directory site, but is kept up-to-date over the Wide Area Network using the same replication technology as the other servers. If a hurricane, earthquake, or other catastrophe should shut down the main datacenter, this remote server can be manually activated and readied for client access in about 15 minutes.Note: If a focus group participant asks “How is this different from Microsoft Clustering Services?”, here’s the answer: “The product team is taking technology from Microsoft Clustering Services and integrate it natively into Exchange Server (no separate management tools required). Would you find value in that? Why or why not?”
Slide Objective:Position the new High Availability (HA) model as the evolution of previous HA methods—with significantly less cost and complexity.Instructor Notes:SituationContinuous Replication, which was first introduced in Exchange Server 2007, made it significantly cheaper to deploy a highly available Exchange infrastructure (removing the need to rely on 3rd party data replication products or expensive shared-storage clustering)The various “flavors” of high availability and disaster recovery in Exchange Server 2007 (Cluster Continuous Replication, Standby Continuous Replication, etc) must be managed separatelyRunning a highly available Exchange infrastructure still requires a great deal of time and expertise, because integration between Exchange Server and Windows Clustering is not seamlessTalking Points:Exchange “14” uses the same Continuous Replication technology found into Exchange Server 2007, but unites on-site (CCR) and off-site (SCR) data replication into a one frameworkExchange server manages all aspects of failover. Clustering is “under the hood” rather than “bolted on the side” Increases the number of replicated data copies to 16, makes failover more granular (database-level rather than server level)Smaller deployments that once required 4 separate servers for redundancy of Exchange server roles now need just 2 serversLegacy Exchange clustering (Single copy clustering, which was the only clustering option in Exchange 2000 and Exchange Server 2003) is being retired in favor of Exchange Server 2007-style clustering
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:Defines properties applicable to the DAG or all servers, e.g..:File Share WitnessList of serversNetwork compressionNetwork EncryptionSupports multiple networks
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:PAM: PAM is the Active Manager in the DAG which decides which copies will be active and passive. It moves to other servers if the server hosting it is no longer able to. You need to move the PAM if you take a server offline for maintenance or upgrade. PAM is responsible for getting topology change notifications and reacting to server failures. PAM is a role of an Active Manager. If the server hosting the PAM fails, another instance of Active Manager adopts the role (the one that takes ownership of the cluster group). The PAM controls all movement of the active designations between a database’s copies (only one copy can be active at any given time, and that copy may be mounted or dismounted). The PAM also performs the functions of the SAM role on the local system (detecting local database and local Information Store failures).SAM: Provides information on which server hosts the active copy of a mailbox database to other components of Exchange, e.g.., RPC Client Access Service or Hub Transport. SAM detects failures of local databases and the local Information Store. It reacts to failures by asking the PAM to initiate a failover (if the database is replicated). A SAM does not determine the target of failover, nor does it update a database’s location state in the PAM. It will access the active database copy location state to answer queries for the active copy of the database that it receives from CAS, Hub, etc.Note: Even on standalone mailbox servers, Active Manager is utilized, which is why Replication Service cannot be stopped.
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:Now that we have a basic understanding of the core components that are involved, we can discuss how log shipping works. Before log files can be shipped to a passive copy, the active database copy must first be seeded. This can be accomplished in a few ways.Automatic seeding An automatic seed produces a copy of a database in the target location. Automatic seeding requires that all log files, including the very first log file created by the database (it contains the database creation log record), be available on the source. Automatic seeding only occurs during the creation of a new server or creation of a new database (or if the first log still exists, i.e. log truncation hasn’t occurred).Seeding using the Update-MailboxDatabaseCopy cmdlet You can use the Update-MailboxDatabaseCopycmdlet in the Exchange Management Shell to seed a database copy. Manually copying the offline database This process dismounts the database and copies the database file to the same location on the passive node. If you use this method, there will be an interruption in service because the procedure requires you to dismount the database.The second option utilizes the streaming copy backup API to copy the database from the active location to the target location.
Slide Objective:Instructor Notes:Exchange Server 2007 utilized Server Message Block (SMB) and Windows file system notifications to get logs. Exchange 2010 utilizes TCP sockets and notifications to the source about which logs are required on the target.
Slide Objective:Instructor Notes:we are trying to find the copy with least data loss that meets the status criteria (i.e. disconnected/healthy, CI healthy, CopyQueueLength, etc). For point 2, emphasize that the goal is to minimize data loss. For point 4, you should clarify that multiple passes of checks are run on the sorted list of copies to pick one. We start with the most restrictive set of checks (i.e. copy has to be healthy/disconnected, has to have healthy CI, etc..) and drop down to the most relaxed (i.e. we don’t care about copy queue length or CI health… just mount!).
Slide Objective:Instructor Notes:
Slide Objective:Instructor Notes:
Situation:In previous versions of Exchange, enabling clustering on an existing mailbox server would have required the administrator to move all mailboxes off the server, tear it down, and then reinstall Exchange. Talking Points:Well, Tomas’ early days as an Exchange administrator, he had a simple environment with one mailbox server. This mailbox server was not part of a cluster.When Tomas decided to implement clustering, he feared that this might involve a great deal of work. In previous versions of Exchange, enabling clustering on an existing mailbox server would have required him to move all mailboxes off the server, tear it down, and then reinstall Exchange. But because of Exchange’s new architecture, Tomas simply had to add a second mailbox server to his environment <click> and place it in a database availability group <click> with Mailbox Server 1. Then he set up replication <click> for the existing databases on Mailbox Server 1. The process was quick, intuitive, and required no reinstallation. To extend this to become a site reliant solution – Tomas can simply add a third mailbox to a separate site in his environment <click> and add it to the DAG. Then he sets up replication <click> for the existing databases.<click> Multiple server roles can also co-exist on servers that provide high availability. This enables small organizations to deploy a two-server configuration provides full redundancy of mailbox data, while also providing redundant Client Access and Hub Transport services.So it is easy to extend an existing Exchange 2010 to include high availability and this reduce both the cost and complexity of the HA deployment.No subnet or special DNS requirements!Slide Objective:Administrators can add high availability to their Exchange environment after their initial deployment, without reinstalling servers.
Talking Points:When creating a DAG, you will provide a name for the DAG, specify a UNC path for a file share witness share and a directory name for the DAG's files share witness, and then configure network encryption and compression settings. You do not need to pre-create the share or directory for the file share witness. Simply specify a UNC path for the share (e.g., \\\\ServerName\\ShareName) and directory (e.g., C:\\FolderName) that will be shared at this path and Exchange will create them both for you. The server hosting the file share witness and directory cannot be a member of the DAG that is using that share and directory. We recommend that you use a Hub Transport server to host the file share witness, as this allows an Exchange administrator to be aware of the availability of the file share witness.Note:The final version of the release product may change this dialog so that Exchange automatically creates the FSW share and directory.Slide Objective:Creation of a Database Availability Group through the EMC is simple and intuitive.
Talking Points:Step 2 – Add servers to the DAGSlide Objective:Creation of a Database Availability Group through the EMC is simple and intuitive.
Talking Points:Step 2 – Add servers to the DAGTo add a Mailbox server to a DAG, the Mailbox server must be running the Windows Server 2008 Enterprise operating system or the Windows Server 2008 Datacenter operating system, must not belong to any other DAGs, must be in the same Active Directory domain as all other Mailbox servers in the DAG, and must have at least two properly configured networks on separate subnets that can communicate with other networks on other servers in the DAG.Note: This process may not work with the Beta version of the product – use the Local PowerShell commands to add the server to the DAG. Later builds (including RTM) will enable this functionality through EMC.You can add database copies for databases which are hosted on the mailbox servers within the DAG.Slide Objective:Creation of a Database Availability Group through the EMC is simple and intuitive.
Talking Points:Exchange Management Shell commands to do add a server to a DAGSlide Objective:Creation of a Database Availability Group through the EMS enables the process to be scripted if required.
Situation:Fear of Windows Clustering keeps many Exchange customers from deploying high availability. To deploy Exchange with Windows clustering, administrators had to master new concepts (such as floating network identities and cluster resources) and use separate management tools. Talking Points:In order to reduce the complexity of managing the new high availability environment, the administration tools have been simplified. While in the past the administration was spilt between Windows Failover Cluster Manager and Exchange, all the HA configuration and management is now done through the Exchange Management tools. The most common tools are all available in the EMC GUI interface, and as all tasks can be scripted or run through via cmdlets.Whether there is a disk failure, a server failure or perhaps a network failure, the same mechanism is used to failover the affected database or databases onto a server which holds a copy of that particular database.In the case of a Full datacenter failure, additional actions will be required to bring the full site online, but since the mailbox servers are all part of the same DAG the failover process for the for the mailbox servers and data the process is the same as a failover within the same site.Slide Objective:Simplified administration of HA reduces the cost and complexity of HA management.
Talking Points:Formerly you would manage mailbox databases on the “server configuration” tab, now this is managed through the organization configuration tab.Note that the concept of a storage group has been removedThrough the admin tool you … Can see how healthy are the databases On which servers do the copies of a database reside Can see what databases reside on a particular serverIt also provides a list of actions which can be taken against a particular database like adding database copies, moving the database master etc.Slide Objective:Simplified administration of HA reduces the cost and complexity of HA management.