2. Module Overview
• Overview of High Availability Options
• Configuring Highly Available Mailbox Databases
• Deploying Highly Available Non-Mailbox Servers
• Deploying High Availability with Site Resilience
3. Lesson 1: Overview of High Availability Options
• What Is High Availability?
• Discussion: Components of a High Availability Solution
4. What Is High Availability?
High availability:
• Implements system design that ensures a high level of
operational continuity
• Is measured by the percentage of time the application is
available
Availability Target Permitted Annual Downtime
99% 87 hours, 36 minutes
99.9% 8 hours, 46 minutes
99.99% 52 minutes, 34 seconds
99.999% 5 minutes, 15 seconds
5. Discussion: Components of a High Availability
Solution
• Which components are important for running a high
availability solution?
• What are some single points of failure in a messaging
solution?
6. Lesson 2: Configuring Highly Available Mailbox
Databases
• What Is a Database Availability Group?
• What is Quorum?
• What Is Active Manager?
• What Is Continuous Replication?
• How Are Databases Protected in a DAG?
• Configuring a Database Availability Group
• Configuring Databases for High Availability
• Demonstration: How to Create and Configure a DAG
• What Is the Transport Dumpster?
• Understanding the Failover Process
• Designing Monitoring and Management for a DAG
• Demonstration: How to Monitor Replication Health
7. What Is a Database Availability Group?
A DAG is a collection of servers that provides the infrastructure
for replicating and activating database copies. DAGs:
• Require the failover clustering feature, although all
installation and configuration is done with the Exchange
Server management tools
• Use Active Manager to control failover
• Use an enhanced version of the continuous replication
technology that Exchange Server 2007 introduced
• Can be created after the Mailbox server is installed
• Allow a single database to be activated on another server
in the group without affecting other databases
• Allow up to 16 copies of a single database on separate
servers
• Define the boundary for replication
8. What Is Quorum?
Quorum defines consensus that enough cluster members
are available to provide services
Exchange Server 2010 DAG quorums:
• Are based on votes in Windows Server 2008
• Allow nodes, file shares, and shared disks to have votes,
depending on the quorum mode
• Use node majority with a witness server for quorum:
• DAGs with an even number of Mailbox servers use the
witness server
• DAGs with an odd number of Mailbox servers use node
majority
9. What Is Active Manager?
Active Manager:
• Runs a process on each server in the DAG
• One node is the PAM
• Remaining nodes are SAM
• Manages which database copies are active and which are
passive
• Stores database state information
• Manages database switchover and failover processes
• Does not require direct administration configuration
10. What Is Continuous Replication?
Database Availability Group
DB1 DB1 DB1
File Mode
Block Mode
Replication Log Buffer
Replication Log Buffer
ESE Log Buffer
11. How Are Databases Protected in a DAG?
Continuous replication protects databases across
servers in the DAG
DB1 DB2 DB2
DB2 DB3 DB3
DB4 DB4 DB4
12. Configuring Database Availability Group
To configure DAGs you must define the following:
• Witness Server – Server used to store witness
information
• Witness Directory – directory used on the witness server
to store witness information
• Database availability group IP addresses – IP
address(es) used by DAG
Additionally consider these settings for larger or multi-site
implementations:
• DAG Networks including replication
• DAG Network Compression
• DAG Network Encryption
• Third-Party Replication Mode
• Alternative Witness Server
• Alternative Witness Directory
13. Configuring Databases for High Availability
After creating a DAG, adding Mailbox servers to the DAG, and
configuring the DAG, you must still do the following:
• Create database copies
• Set truncation lag time
• Set replay lag time
• Set preferred list sequence number
14. Demonstration: How to Create and Configure a
DAG
In this demonstration, you will see how to create and
configure a DAG
15. What Is the Transport Dumpster?
The transport dumpster:
• Protects against Mailbox server failures when
transaction logs have been lost
• Keeps copies of all messages delivered in the transport
queue (mail.que) until the transaction logs have
replicated to all servers in the DAG, or until the
maximum dumpster size is reached
• Redelivers missing email messages when a failure
occurs
16. Understanding the Failover Process
If a failure occurs, the following steps occur for the failed
database:
Active Manager determines the best copy to activate
The replication service on the target server attempts to
copy missing log files from the best “source”:
• If successful, the database mounts with zero data loss
• If unsuccessful (failover), the database mounts based
on the AutoDatabaseMountDial setting
The mounted database generates new log files (using
the same log generation sequence)
Transport dumpster requests are initiated for the
mounted database to recover lost messages
When original server or database recovers, it
determines if any logs are missing or corrupt, and fixes
them if possible
17. Designing Monitoring and Management for a DAG
• Allocate the necessary permissions for managing
a DAG
• Organization Management
• DAGs
• Database copies
• Failure may not be noticed
• Exchange Server 2010 SP1 or newer includes
several scripts and commands for DAG monitoring
and management
• Consider using System Center Operations
Manager 2012
18. Demonstration: How to Monitor Replication Health
In this demonstration, you will see how to:
• Monitor replication health using the Exchange
Management Console and the Exchange Management
Shell
• View various status messages
• View available statistics
19. Lesson 3: Deploying Highly Available Non-Mailbox
Servers
• How High Availability Works for Client Access Servers
• How Shadow Redundancy Provides High Availability for
Hub Transport Servers
• How High Availability Works for Edge Transport Servers
20. How High Availability Works for Client Access
Servers
A client access array is created with multiple Client Access
servers. You can achieve high availability and load balancing by
using one of these methods:
• Software-based NLB
• Hardware-based NLB
• Round-robin DNS
To configure a client access array:
• Use the New-ClientAccessArray cmdlet
• Configure existing databases using the Set-
MailboxDatabase cmdlet with the RpcClientAccess
parameter
• Configure internal URIs for Exchange services
21. How Shadow Redundancy Provides High
Availability for Hub Transport Servers
Transport server delays message deletion until it verifies that
the message has been delivered past the next hop
s
tu
d sta
ar 2.
d isc De
liv
ry er
Q ue
3. to
ne
e1 xt
Ed g ho
r to p
elive Edge1
1. D
4.
If
fa
ilu
Hub re
,r
es
ub External
m
it SMTP Mail
Server
Edge2
22. How High Availability Works for Edge Transport
Servers
Load balancing and high availability methods for Edge Transport
include:
• Multiple DNS MX records that are created to specify
multiple authoritative SMTP servers for the domain.
domain
• Hardware-based load balancing that is used to load
balance inbound SMTP connections to any available Edge
Transport server.
server
23. Lesson 4: Deploying High Availability with Site
Resilience
• Requirements for Creating a Multiple Site DAG
• What Is Datacenter Activation Coordination Mode?
• Deploying Exchange 2010 for Site Resilience
• Switchover and Switchback Process with Site Resilience
• Best Practices for Site Resilient Solutions
24. Requirements for Creating a Multiple Site DAG
Requirements include:
• At least one Mailbox server in each site
• Round-trip network latency time of maximum 500
milliseconds between DAG members
• Other server roles must be available in each site
• DAC mode for DAGs that span multiple locations
• Prevents split-brain syndrome
25. What Is Datacenter Activation Coordination
Mode?
DAC mode:
• Prevents split-brain syndrome
• Uses the DACP Protocol to decide if a database can be
mounted
• 0 : Database cannot be mounted
• 1 : Database can be mounted
No DACP=1, no
database mounted
Mount
database
DACP=1
DACP=0 DACP=1 DACP=1
DACP=1
DACP=0
Data center DAG in DAC Data center
Data center
Data center DAG in DAC 2
11 Mode
Mode 2
26. Deploying Exchange 2010 for Site Resilience
Site resiliency:
• Requires the following server roles to be
available in each site (besides the Mailbox role):
• Active Directory Domain Controller
• Hub Transport server
• Client Access server
• Does not require any special configuration for
Hub Transport and Client Access servers
• The Edge Transport server:
• Requires Internet connectivity for the
alternate data center
• Requires multiple MX records for incoming
messages
27. Switchover and Switchback Process with Site
Resilience
Site A Site B
Hub Transport Client Access Hub Transport
Client Access
(FSW)
(Alt FSW)
DAG
28. Best Practices for Site Resilient Solutions
Best practices include:
• Reduce failover time by using low TTL on DNS records
for the Client Access server array, Client Access server
URLs, and SMTP records
• Verify failover functionality with periodic testing
• Closely monitor replication health and other system
components to ensure failover health
• Follow proper change-management procedures
• Prevent cluster network cross-talk
29. Lab: Implementing High Availability
• Exercise 1: Deploying a DAG
• Exercise 2: Deploying Highly Available Hub Transport and
Client Access Servers
• Exercise 3: Testing the High Availability Configuration
Logon information
Estimated time: 60 minutes
30. Lab Scenario
You are the messaging administrator for A. Datum Corporation.
You have completed the basic installation for three Exchange
servers. Now you must complete the configuration so that they
are highly available.
31. Lab Review
• When might you choose to initiate a database switchover?
• If you deploy only two Hub Transport servers in an Active
Directory site, would shadow redundancy protect
messages between mailboxes in the same site?
32. Module Review and Takeaways
• Review Questions
• Common Issues and Troubleshooting Tips
• Real-World Issues and Scenarios
• Best Practices
Editor's Notes
Module 7: Implementing High Availability Course 10135B Presentation: 85 minutes Lab: 90 minutes After completing this module, students will be able to: Describe the high availability options Configure highly available mailbox databases Deploy highly available non-Mailbox servers Deploy high availability solution for multiple sites Required materials To teach this module, you need the Microsoft® Office PowerPoint® file 10135A_07.ppt. Important: We recommend that you use PowerPoint 2002 or a newer version to display the slides for this course. If you use PowerPoint Viewer or an earlier version of PowerPoint, all the features of the slides might not be displayed correctly. Preparation tasks To prepare for this module: Read all of the materials for this module. Practice performing the demonstrations and the lab exercises. Work through the Module Review and Takeaways section, and determine how you will use this section to reinforce student learning and promote knowledge transfer to on-the-job performance. Note about the demonstrations : To prepare for the demonstrations, start the 10135A-VAN-DC1 virtual machine and log on to the server before starting the other virtual machines. To save time during the demonstrations, log on to the Exchange servers and open the Exchange Server management tools before starting the demonstrations. Additionally, connect to the Microsoft Outlook® Web App site on the Exchange servers, and then log on as Administrator. It can take more than a minute to open the management tools and Outlook Web App for the first time.
Module 7: Implementing High Availability Course 10135B
Module 7: Implementing High Availability Course 10135B
Lead a discussion about high availability. Define high availability as the implementation of a system design that ensures a high level of operational continuity over a specific time period. Many students may attribute high availability to failover clustering or load balancing. However, you can achieve high availability only with good design, testing, training, and operational processes. Also mention that you measure availability by the percentage of time the application is available. Also be sure to note that planned downtime is usually exempted from these measurements. Planned downtime is downtime scheduled for routine proactive updates and maintenance. References Microsoft High Availability White Paper http://go.microsoft.com/fwlink/?LinkId=179979 Module 7: Implementing High Availability Course 10135B
Lead a discussion about the components that make up a high availability solution. Question: Which components are important for running a high availability solution? Answer: Be sure to capture these main points, but know that students answers may vary: Redundant hardware components Server hardware Network hardware Storage hardware Software components (clustering) Operational readiness Operational training Proactive maintenance Change management Detailed system health monitoring Question: What are common single points of failure in a messaging solution? Answer: Answers will vary. Some may include: Internet connectivity; server failures with hard drives, fans and power supplies; and environmental factors, such as power and cooling. Module 7: Implementing High Availability Course 10135B
Module 7: Implementing High Availability Course 10135B
Explain a database availability group (DAG) and how it leverages continuous transaction log replication to the passive database copies within the DAG. Point out that a DAG: Requires the failover clustering feature, although all installation and configuration tasks occur with the Exchange Server management tools . Although a DAG requires the failover clustering feature, Exchange Server does not use Windows® failover clustering to handle database failover. Instead, it uses Active Manager to control failover. Uses an enhanced version of the continuous replication technology that Exchange Server 2007 introduced . Exchange Server 2010 improves upon the best continuous replication pieces from Exchange Server 2007. Can be created after you have installed the Mailbox server . You can set up a Mailbox server to host active mailboxes, and then add to a DAG later. Allows a single database to be moved between servers in the group without affecting other databases . Failover clustering is done per mailbox database, not for an entire server, making Exchange Server 2010 more flexible than previous Exchange Server versions. Allows up to 16 copies of a single database on separate servers . You can add 16 servers to a DAG, which allows you to create 16 databases. Defines the boundary for replication since only servers within the DAG can host database copies . You cannot replicate database information to Mailbox servers outside the DAG. Question: During installation, do you need to decide if you want to join a server to a DAG? Answer: No, you can freely add and remove servers from a DAG after installation. This is a significant benefit over Exchange Server 2003 and Exchange Server 2007. Module 7: Implementing High Availability Course 10135B
Explain the concept of quorum and how node majority is reached. You can, for example, draw an example on a whiteboard with three servers and a witness server. Explain what happens if one node goes down. Basically the two servers together with the witness server vote will reach quorum and the cluster will mount. Then explain what happens if another node fails, leaving only one server and the witness server. They do not reach quorum and cannot mount automatically. You need to manually mount this cluster if needed. Module 7: Implementing High Availability Course 10135B
Discuss the role of the Active Manager. Mention that although the DAG relies on the failover clustering feature, it relies on Active Manager to do the failover or switchover processes. References Exchange Server 2010 Help: Understanding Mailbox Database Availability: Active Manager Module 7: Implementing High Availability Course 10135B
Explain the scenario first. We have a DAG that contains three servers, each with a database copy of DB1. The left DB1 (blue) is the active database, the other two databases are passive copies. Talk through the continuous replication–file mode process using the slide: Exchange Server writes, and then closes the active log. The Replication Service replicates the closed log to servers hosting the passive databases. Because each copy of the database is identical, Exchange Server inspects, and then replays or applies, the transaction logs to the database copies. The databases remain in sync. Mention that you must seed the databases for this process. Seeding a database ensures that a good copy of the database is available on each Exchange server, so that you can replay transaction logs as appropriate. Next describe continuous replication – block mode, which is new for Exchange Server 2010 Service Pack 1 (SP1). Explain that this mode is automatically triggered when Exchange Server 2010 recognizes that all log files have been copied correctly to the Exchange servers that host the passive databases. Then talk through the block mode process: Once in block mode, any block of data written to the ESE log buffer on the Exchange server that hosts the active database is automatically copied to the replication log buffer of all passive database copies. When the ESE log buffer is full, the final block is sent to the passive databases and a transactional log file is written on the Exchange server that hosts the active database. Then the ESE log buffer is emptied. When the Exchange servers hosting the passive databases receive the final block that fills up their replication log buffer, they also save the buffer to a transactional log file. After that the buffer is emptied and the process starts again. When the server with the active database fails, but the replication log buffer is not full yet, then the buffer is saved to a new transactional log file. Question: How do you configure continuous replication – block mode? Answer: Continuous replication – block mode is enabled by default. You do not need to configure it. Reference Understanding High Availability and Site Resilience http://go.microsoft.com/fwlink/?LinkID=212703 Module 7: Implementing High Availability Course 10135B
The active database copy uses continuous replication to keep the passive copies in sync, based on their lag-time setting. A DAG leverages the Windows Server® operating system failover clustering feature. However, it relies on the Active Manager server to maintain status of all of the databases that the DAG hosts. A single database can be switched or failed between servers within the DAG. At any given time, a copy is either the replication source or the replication target, but not both. A server may not host more than one copy of a given database. Not all databases need to have the same number of copies. In a 16-node DAG, one database can have 16 copies, while another database is not redundant and contains only the one active copy. Discuss the difference between a failover and a switchover. Question: Can you have one database copied three times, and another database copied only one time? Answer: Yes. Database copies are configured on a per-database basis, thus you can decide how many copies of a specific database you want to create, and on which Exchange servers they should be hosted. References Exchange Server 2010 Help: Understanding Mailbox Database Availability Module 7: Implementing High Availability Course 10135B
This topic discusses all the relevant DAG configuration settings that are available to configure a DAG. You should explain the details about each point and mention when to use it. For example, mention that you need to configure DAG Networks only if you have multiple sub-nets and the automatic discovery of these subnets was not working correctly. Module 7: Implementing High Availability Course 10135B
Discuss the process of configuring a database to be highly available within a DAG. This includes to describe and discuss the importance to configure truncation, replay lag times, and preferred list sequence number. Question: How do you plan to use the preferred list sequence number? Answer: Answers may vary. However, many students will prefer to spread the activity to multiple servers. Rotating the preference for the databases through all available servers allows each server to serve client requests actively. References Exchange Server 2010 Help: Add a Mailbox Database Copy Module 7: Implementing High Availability Course 10135B
Demonstrate how to create a DAG by adding at least two servers to the group, and then adding a copy of one of the databases. Discuss the prerequisites for configuring a DAG, such as Windows Server 2008 Enterprise, and that Microsoft recommends that the failover clustering feature be installed prior to creating a DAG. Preparation Ensure that the 10135A-DC1, 10135A-VAN-EX1, and 10135A-VAN-EX2 virtual machines are running. Log on to 10135A-VAN-EX1 as Administrator with the password of Pa$$w0rd . Demonstration Steps On VAN-EX1, click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Console . In the console tree, expand Microsoft Exchange On-Premises , expand Organization Configuration , and then click Mailbox . In the results pane, click the Database Availability Groups tab. On Actions pane, click New Database Availability Group . In the New Database Availability Group Wizard , type DAG1 in the Database availability group name field, click Witness Server , and then type VAN-DC1 in the host name field. Click Witness Directory , type C:\\FSWDAG1 in the path field, and then click New . In the New Database Availability Group Wizard , on Completion page, you can ignore the warning, and then click Finish . Note: In an actual business environment we recommend that you use the local Hub Transport server, not a domain controller, to act as the file share witness. A two-node DAG configuration requires a file share witness, since it requires a majority of votes at all times to maintain quorum. In a two-node cluster without a file share witness, when you reboot one of the nodes, a majority of votes cannot be obtained, so quorum is not maintained, and the cluster fails. You can specify the Hub Transport server and the local directory that you want to configure as the file share witness when you create a DAG. As a best practice, you should add the file share witness to other clusters too. Clusters with even numbers of nodes use the file share witness as a tiebreaker vote in establishing quorum. In the Work pane on the Database Availability Groups tab, right-click DAG1 , and then click Properties from the context menu. In the DAG1 properties window, click IP Addresses tab. In IP Addresses tab, click Add, and then type 10.10.0.25 in Database availability group IP addresses field. Click OK , and then click OK to leave DAG1 properties. In the Microsoft Exchange Warning window, click OK . Module 7: Implementing High Availability Course 10135B
In the Work pane on the Database Availability Groups tab, right-click DAG1 , and then click Manage Database Availability Group Membership from the context menu. In the Manage Database Availability Group Membership Wizard, click Add . In the Select Mailbox Server dialog box, click VAN-EX1 , and then click OK . In the Manage Database Availability Group Membership Wizard, click Add . In the Select Mailbox Server dialog box, click VAN-EX2 , and then click OK . In the Manage Database Availability Group Membership Wizard, click Manage to complete the changes, and then click Finish to close the wizard. In the Results pane, click the Database Management tab. In the Results pane, click Mailbox Database 1 , and then in the Actions pane, click Add Mailbox Database Copy . If you do not see Add Mailbox Database Copy on Actions pane, click Refresh . In the Add Mailbox Database Copy Wizard, click Browse to select the server to which to add the copy. In the Select Mailbox Server dialog box, click VAN-EX2 , and then click OK . In the Add Mailbox Database Copy Wizard, click Add to create the copy of Mailbox Database 1. Review the results, and then click Finish . Note Once you create a DAG, you then can create and configure DAG networks for replication or for MAPI traffic. Add additional networks for redundancy or improved throughput. Question : What information do you need before you can configure a DAG? Answer: At minimum, the administrator needs to know within which network the DAG will reside and the servers that will participate. Reference Exchange Server 2010 Help: Create a Database Availability Group, Managing Database Availability Group Membership, and Managing Database Availability Groups Module 7: Implementing High Availability Course 10135B
The transport dumpster was first introduced in Exchange Server 2007 to allow for recovery of database failovers of Clustered Continuous Replication (CCR) and Local Continuous Replication (LCR) availability solutions. Module 7: Implementing High Availability Course 10135B
Discuss what would happen if a failure occurred and not all of the logs could be copied to the passive copy before you need to make it active. Discuss how to use the transport dumpster when failover occurs. Active manager selects the best copy to activate, as follows: 1. Enumerates the available copies 2 Ignores all servers that are unreachable 3. Sorts available copies by how up-to-date they are (content index, copy queue length and replay queue length). Breaks ties based on activation preference Activity Draw a simple Exchange Server messaging environment on the board. Be sure to include one Hub Transport server and two Mailbox servers configured in a DAG. Now walk through a scenario in which a message is delivered through the Hub Transport server to the active database on the first Mailbox server. Just after Exchange Server delivers the message, the server fails, and no data is recovered. Now walk through the steps on the slide to discuss how this database failover process works. Discuss the AutoDatabaseMountDial setting and when to use it. The three possible values for the AutoDatabaseMountDial setting are: BestAvailability . If you specify this value, the database automatically mounts if the copy queue length is less than or equal to twelve. The copy queue length is defined as the number of logs that the passive copies recognize as needing to be replicated. If the copy queue length is more than twelve, the database will not automatically mount. When the copy queue length is less than or equal to twelve, Exchange Server attempts to replicate the remaining logs to the passive copies and mount the database. GoodAvailability . If you specify this value, the databases automatically mount immediately after a failover, if the copy queue length is less than or equal to six. If the copy queue length is more than six, the databases do not automatically mount. When the copy queue length is less than or equal to six, Exchange Server attempts to replicate the remaining logs to the passive copy and mount the database. Lossless . If you specify this value, the databases do not automatically mount until all logs that were generated on the active copy are copied to the passive copy. References Exchange Server 2010 Help: Set-MailboxServer Module 7: Implementing High Availability Course 10135B
You should only allocate permissions to those who are knowledgeable about DAGs, and are responsible for their maintenance. The ability to manage DAGs and database copies may not be required by most Exchange Server administrators. One of the unique problems in managing DAGs is that you may not notice the failure of a DAG member immediately, because users are not affected. To actively monitor DAG members, consider using Microsoft Systems Center Operations Manager 2012, or another monitoring tool. Question: Which users in your organization will have permission to manage DAGs? Answer: Answers will vary. In general, large organizations may have a small, dedicated group for managing the DAG-level components. Mid-sized organizations will most likely allow Exchange Server administrators to manage the DAG. Module 7: Implementing High Availability Course 10135B
Demonstrate how to use the Exchange Management Console and Exchange Management Shell to review the available information regarding database replication health. In the demonstration, show how to view the health status of the database copies in the Exchange Management Console or Exchange Management Shell. Preparation Ensure that the 10135A-DC1, 10135A-VAN-EX1 and 10135A-VAN-EX2 virtual machines are running. Log on to 10135A-VAN-EX1 as Administrator with the password of Pa$$w0rd . Demonstration Steps On VAN-EX1, click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Console . In the Console Tree, expand Microsoft Exchange On-Premises , expand Organization Configuration , and then click Mailbox . In the Results pane, click the Database Management tab. In the Results pane, click Mailbox Database 1 , and then in the Actions pane, in the bottom Mailbox Database 1 area, click Properties . Review the information on the General tab: The database status might be Healthy, Initializing, Failed, Mounted, Dismounted, Disconnected, Suspended and Failed, Suspended, Resynchronizing, Seeding. Describe Copy queue length (logs) and Replay queue length (logs) . Click OK to close. Click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Shell . At the Exchange Management Shell prompt, type Test-ReplicationHealth , and then press Enter. This cmdlet performs a variety of tests, and reports back the status for the replication components. At the Exchange Management Shell prompt, type Get-MailboxDatabaseCopyStatus , and then press Enter. This cmdlet shows status information about all database copies available on the local server. At the Exchange Management Shell prompt, type CD “C:\\Program Files\\Microsoft\\Exchange Server\\V14 At the Exchange Management Shell prompt, type .\\CheckDatabaseRedundancy.ps1 –MaiboxDatabaseName “Mailbox Database 1” , and then press Enter. Notice that the CurrentStatus should be Green to indicate that the all databases are healthy and the database is mounted on the server with the lowest Activation preference number (APN). Question : Why is monitoring these statistics important? Answer : High availability is more than just redundant software and hardware. It is a crucial tool for identifying and reacting to problems quickly and effectively. Monitoring the statistics can help you do this. Module 7: Implementing High Availability Course 10135B
Module 7: Implementing High Availability Course 10135B
Discuss creating a client access array, and that you should create an array of Client Access servers (using the New-ClientAccessArray cmdlet) in each site, and then load-balance using one of the methods listed on the slide. Question: What three load-balancing possibilities can you use for your Client Access servers? Answer: You can use hardware load balancing, software load balancing, or round-robin Domain Name System (DNS). Course 10165A Module 8: Implementing High Availability
Walk the students through a typical shadow redundancy scenario: Hub delivers message to Edge Hub opens a Simple Mail Transfer Protocol (SMTP) session with Edge. Edge advertises shadow redundancy support. Hub notifies Edge to track discard status. Hub submits message to Edge. Edge acknowledges the receipt of message, and records the Hub’s name for sending the message’s discard information. Hub moves the message to the shadow queue for Edge, and marks Edge as the primary server. Hub becomes the shadow server. Edge delivers message to next hop: Edge submits message to third-party mail server. Third party mail server acknowledges the message’s receipt. Edge updates the message’s discard status as delivery complete. Hub queries Edge for discard status (success case): At end of each SMTP session with Edge, Hub queries Edge for discard status on messages previously submitted. If Hub has not opened any SMTP sessions with Edge after the initial message submission, it will open an SMTP session with Edge just to query for discard status after a specific time. Edge checks local discard status, sends back the list of messages that have been delivered, and removes the discard information. Hub server deletes the list of messages from its shadow queue. Hub queries Edge for discard status and resubmits the message (failure case): If Hub cannot contact Edge, Hub resumes the primary server role, and resubmits the messages in the shadow queue. Resubmitted messages are delivered to another Edge server, and the workflow starts from step 1. Within Exchange Server 2010, the Shadow Redundancy Manager is the core component of a Transport server that is responsible for managing shadow redundancy. The Shadow Redundancy Manager is responsible for maintaining the following information for all the primary messages that a server is currently processing: The shadow server for each primary message being processed. The discard status to be sent to shadow servers. Explain what feature Shadow redundancy promotion is and that it is available since Exchange Server 2010 SP1. If time permits, allow the students to create other scenarios that might occur, and then use the board to walk through how shadow redundancy provides easy recovery. Question: On which Exchange 2010 server roles can you find the Shadow Redundancy Manager? Answer: You find the Shadow Redundancy Manager on the servers roles that route messages, such as Hub Transport server role and Edge Transport server role. Module 7: Implementing High Availability Course 10135B
Discuss the high availability options for the Edge Transport role. Be sure to mention why you might choose these options. Edge Transport also supports shadow redundancy, as mentioned in the previous slide, but shadow redundancy does not cover all scenarios because most of the messaging servers to which the Edge Transport role communicates do not support shadow redundancy. Acknowledge that you would use similar methods if you were using Hub Transports for inbound and outbound email. Module 7: Implementing High Availability Course 10135B
Module 7: Implementing High Availability Course 10135B
The design of DAGs allows you to add a Mailbox server from an alternate site. Then, you can add a database copy to the servers in the alternate sites as well. By doing this, the DAG is configured for site resilience, and no special configuration is required. However, network latency must be below 250 milliseconds round-trip time, otherwise the DAG members cannot communicate. The alternate site must have Hub Transport and Client Access servers available, but that is a basic requirement of installing a Mailbox server in that location. Those roles can be placed on the Mailbox server, if required. Datacenter Activation Coordination mode can be used for DAGs with three or more members extended to multiple sites. The Datacenter Activation Coordination mode prevents split-brain syndrome by not allowing automatic recovery to the main site even if quorum is achieved, because the remote site may be running the databases. Question: What is the maximum round trip network latency time of DAG members? Answer: The maximum is 250 milliseconds between each DAG member. Module 7: Implementing High Availability Course 10135B
Explain why Datacenter Activation Coordination mode was implemented in Exchange Server 2010. The scenario shown in the animation is a Datacenter Activation Coordination mode with a DAG spanning two Active Directory® directory service sites. Database 1 is mounted due to quorum. Talk the students through a site failover process with the animation, as follows: The primary data center and the wide area network (WAN) link fails. As the database was very important, you decide to bring the database online at the secondary data center. You use a manual process to mount the database. Once the servers at the primary data center are back online—the WAN connection is still down—no server will automatically mount a database, because their DACPs are 0 as they cannot reach the servers in the secondary data center. When the WAN link is recovered, the databases at the primary site “see” the databases at the secondary site, and begin to synchronize. You then can safely mount back the active database to the primary datacenter. Be sure to mention that Exchange Server 2010 SP1 supports Datacenter Activation Coordination mode with two-member DAGs that have each member in a separate data center. Question: When should you consider configuring the Datacenter Activation Coordination mode? Answer: You should consider the Datacenter Activation Coordination mode as soon as you configure DAG members located at two or more Active Directory sites. Reference Understanding Datacenter Activation Coordination Mode http://go.microsoft.com/fwlink/?LinkID=185451 Module 7: Implementing High Availability Course 10135B
Mention that site resilience requires the following server roles to be available in each site besides the Mailbox roles that you already configured in a DAG: Active Directory Domain Controller Hub Transport server Client Access server Explain that the DC; Hub Transport and Client Access servers do not require any special configuration to support site resiliency, but need to be available in the remote site. Question: When using Exchange Server 2010 SP1 or newer, do you need to update the RPCClientAccessServer property when a site failover occurs and your Client Access servers at the original site are not available anymore? Answer: No, if you configured cross-site direct connect. Module 7: Implementing High Availability Course 10135B
You must enable DAC mode for DAGs to allow for an administrator-activated site switchover. When the primary data center fails, the switchover process includes: DNS records are adjusted for Simple Mail Transfer Protocol (SMTP), Outlook Live, Autodiscover, Outlook Anywhere, and any legacy protocols (if necessary). You configure DAG to remove the primary site servers from the Windows Failover Cluster, but not to remove them from the DAG. You configure DAG to use an alternative file-share witness, and to restore the secondary site’s functionality. Remaining Active Managers coordinate mounting databases in secondary site. To switch back, replicate the database copy back to the original site, and then activate it in the original site. The activation does not need to be done during off-hours, but can be done to reduce the risk of a problem affecting users. Question: What is the basic requirement for switching the active database to a secondary data center? Answer: The DAG needs to be in Datacenter Activation Coordination mode. Module 7: Implementing High Availability Course 10135B
Discuss the following best practices that users should follow for multisite failover. Ask the students about the practices that they find useful, or that they expect to find useful. Use time-to-live (TTL) on DNS records for the Client Access server array, Client Access server URLs, and SMTP records, to reduce failover time. Having a low TTL enables the DNS clients to more quickly discover the DNS entries that point to the secondary site. Monitor all aspects of the Exchange Server environment to ensure that it is functioning normally, and that mailbox data is successfully being replicated in a timely manner to the secondary site. Also, talk about scheduling periodic failover tests to provide an additional level of preparation, and to provide validation to the configuration and operation of the cross-site failover process. Change management is important, and you should use it to ensure that each Mailbox server in the DAG, each Client Access server, and each Hub Transport server are configured identically and have the same updates applied. Doing this reduces the possibility of incompatibilities and/or the unexpected behavior that could occur after a failover. To reduce network congestion and potential communications problems, cluster heartbeat communications should not be allowed between these networks. Question: In your organization, do you regularly perform disaster recovery activities such as recovering a backup in Exchange Server? Answer: The answer depends on the organization. However, it is a best practice to regularly practice failover, backup, and recovery to ensure that Exchange Server 2010 performs as expected. Module 7: Implementing High Availability Course 10135B
In this lab, students will: Deploy a DAG Deploy highly available Hub Transport and Client Access servers Exercise 1 In this exercise, students will deploy a DAG and test continuous replication. Inputs: Students will be provided with the instructions on how to configure a DAG, and then how to configure multiple database copies. Outputs: Students will configure a DAG and configure multiple copies of a database between multiple servers. Students also will verify that continuous replication is working by performing a database switchover. Exercise 2 In this exercise, students will configure and test highly available Client Access and Hub Transport servers. Inputs: Students will be given instructions on how to make the Client Access and Hub Transport servers highly available. Outputs: Students will configure and verify network load balancing for Client Access servers, and verify that the Hub Transport servers are highly available. Exercise 3 In this exercise, students will verify that the high availability configuration is working properly. Inputs: Students will be give instructions on how to test the configuration for successful configuration. Outputs: Students will use the SMTP service and Queue Viewer to verify messages, and simulate a server failure. Before the students begin the lab, read the scenario associated with each exercise to the class. This will reinforce the broad issue that the students are troubleshooting and will help to facilitate the lab discussion at the module’s end. Remind the students to complete the discussion questions after the last lab exercise. Module 7: Implementing High Availability Course 10135B
Module 7: Implementing High Availability Course 10135B
Use the questions on the slide to guide the debriefing after students complete the lab exercises. Question: When might you choose to initiate a database switchover? Answer: You can initiate database switchovers to move databases off a DAG member for maintenance tasks, such as applying software updates. Question: If you deploy only two Hub Transport servers in an Active Directory site, would shadow redundancy protect messages between mailboxes in the same site? Answer: Shadow redundancy does not protect messages delivered within the same site, because the messages will not have traversed more than one Hub Transport server. However, you can recover these messages using the transport dumpster functionality. Module 7: Implementing High Availability Course 10135B
Review Questions Question: What are the requirements for using the Datacenter Activation Coordination mode? Answer: You require Exchange Server 2010 SP1 or newer, the DAG must have at least two members, and span at least two Active Directory sites. Question: Which Exchange Server 2010 feature provides fault tolerance for message delivery? Answer: Shadow redundancy is a new feature in Exchange Server 2010 that ensures messages are not lost when a Hub Transport server fails. The sending Hub Transport server keeps a copy of a message until the next hop reports the message as delivered. Question: In which scenarios might you use hardware load balancing with Edge Transport servers? Answer: In high utilization scenarios requiring hundreds of Edge Transport servers, it may make more sense to use a hardware load balancer than to create hundreds of DNS MX records. Doing this also may reduce the number of public IP addresses required. Question: Besides planning for Exchanger Server failures, what other failures should you consider? Answer: Exchange Server high availability configurations protect against software and server failures, and database corruption. It is important to consider larger issues, such as local network failures, Internet connectivity issues, and data center power and cooling failures. Question: In which scenarios might you use hardware load balancing with Edge Transport servers? Answer: In high utilization scenarios requiring hundreds of Edge Transport servers, it may make more sense to use a hardware load balancer than to create hundreds of DNS MX records. Doing this also may reduce the number of public IP addresses required. Question: To make a highly available Exchange Server 2010 organization, which components must be highly available? Answer : All components need to be highly available. If one component in the Exchange Server 2010 organization is not highly available, then the entire Exchange Server 2010 organization is not highly available. You need to consider high availability for all Exchange Server roles, supporting services, and infrastructure. Question: How many networks should you use for a DAG? Answer : You should use at least two networks for a DAG: the first network is dedicated to replication of transaction logs; the second network is primarily used for Messaging Application Programming Interface (MAPI) communication, but can also be used for replication if the replication network fails. Module 7: Implementing High Availability Course 10135B
Common Issues Related to Creating High Availability Edge Transport Solutions Identify the causes for the following common issues related to high availability Edge Transport servers, and fill in the troubleshooting tips. For answers, refer to relevant lessons in the module. Real-World Issues and Scenarios Question: An organization has several branch offices with a small number of employees. However, the organization needs to deploy a high availability solution in the remote offices. What configuration can it deploy to meet it business needs? Answer: It may be possible to deploy two servers and install the Mailbox, Hub Transport, and Client Access server roles on both. The organization can create a DAG and use a hardware load balancer to load balance client access connectivity. Question: An organization uses a variety of service-level agreements for database availability for different business units. It wants to minimize the number of mailbox servers it deploys. How can it do this? Answer: Deploy all Mailbox servers in a single DAG, and then configure each of the business unit’s mailbox databases with the appropriate number of copies to meet the service. Module 7: Implementing High Availability Course 10135B Issue Troubleshooting tip Inbound email is not being delivered evenly across all of the Edge Transport servers. Ensure that the DNS MX records have the same value. If the values are not the same, only the records with the lowest value will be used. After deploying highly available Edge Transport servers, outbound email is being returned as possible spam. Verify that your outbound mail servers are configured with a host name that is resolvable on the Internet. Many servers reject email from servers that do not have a name or an IP address that can be resolved on the Internet.
Best Practices Related to Designing a High Availability Solution Supplement or modify the following best practices for your own work situations: Identify all possible failure points before designing a solution. Even the most elaborate and expensive designs can have a simple and crippling failure point. Document all of the components to the solution so that everyone involved in the deployment understands how the solution is configured. Follow change-management procedures. In some environments, it may be tempting to skip these steps. However, not following proper change-management procedures often leads to extended, unplanned downtime. Use a client access array and load-balancing to make client access highly available. If a Client Access server is also a member of a DAG, then use hardware-based load-balancing. Ensure that Internet-accessible sites that proxy Client Access for multiple sites are highly available, because their outage will affect many users. When a mailbox database fails over to an alternate site for a short period of time, allow the clients to continue using the client access array in the original site. Module 7: Implementing High Availability Course 10135B