SlideShare a Scribd company logo
1 of 32
Module 7
Implementing High
       Availability
Module Overview
• Overview of High Availability Options

• Configuring Highly Available Mailbox Databases

• Deploying Highly Available Non-Mailbox Servers

• Deploying High Availability with Site Resilience
Lesson 1: Overview of High Availability Options
• What Is High Availability?

• Discussion: Components of a High Availability Solution
What Is High Availability?

 High availability:
  • Implements system design that ensures a high level of
    operational continuity
  • Is measured by the percentage of time the application is
    available




   Availability Target       Permitted Annual Downtime

           99%                     87 hours, 36 minutes
          99.9%                   8 hours, 46 minutes

         99.99%                  52 minutes, 34 seconds

        99.999%                  5 minutes, 15 seconds
Discussion: Components of a High Availability
Solution
• Which components are important for running a high
 availability solution?
• What are some single points of failure in a messaging
 solution?
Lesson 2: Configuring Highly Available Mailbox
Databases
• What Is a Database Availability Group?

• What is Quorum?

• What Is Active Manager?

• What Is Continuous Replication?

• How Are Databases Protected in a DAG?

• Configuring a Database Availability Group

• Configuring Databases for High Availability

• Demonstration: How to Create and Configure a DAG

• What Is the Transport Dumpster?

• Understanding the Failover Process

• Designing Monitoring and Management for a DAG

• Demonstration: How to Monitor Replication Health
What Is a Database Availability Group?

A DAG is a collection of servers that provides the infrastructure
for replicating and activating database copies. DAGs:


 • Require the failover clustering feature, although all
   installation and configuration is done with the Exchange
   Server management tools
 • Use Active Manager to control failover
 • Use an enhanced version of the continuous replication
   technology that Exchange Server 2007 introduced
 • Can be created after the Mailbox server is installed
 • Allow a single database to be activated on another server
   in the group without affecting other databases
 • Allow up to 16 copies of a single database on separate
   servers
 • Define the boundary for replication
What Is Quorum?

 Quorum defines consensus that enough cluster members
 are available to provide services


Exchange Server 2010 DAG quorums:

 • Are based on votes in Windows Server 2008

 • Allow nodes, file shares, and shared disks to have votes,
   depending on the quorum mode

 • Use node majority with a witness server for quorum:
     • DAGs with an even number of Mailbox servers use the
       witness server
     • DAGs with an odd number of Mailbox servers use node
       majority
What Is Active Manager?

Active Manager:


 • Runs a process on each server in the DAG
    • One node is the PAM
    • Remaining nodes are SAM
 • Manages which database copies are active and which are
   passive
 • Stores database state information
 • Manages database switchover and failover processes
 • Does not require direct administration configuration
What Is Continuous Replication?
                                    Database Availability Group



             DB1                                  DB1                        DB1




File Mode


Block Mode




                                                                               Replication Log Buffer
                                                    Replication Log Buffer
                   ESE Log Buffer
How Are Databases Protected in a DAG?
     Continuous replication protects databases across
     servers in the DAG




       DB1                DB2                 DB2




        DB2               DB3                 DB3




       DB4                 DB4                DB4
Configuring Database Availability Group
To configure DAGs you must define the following:
  • Witness Server – Server used to store witness
    information
  • Witness Directory – directory used on the witness server
    to store witness information
  • Database availability group IP addresses – IP
    address(es) used by DAG

Additionally consider these settings for larger or multi-site
implementations:
  • DAG Networks including replication
  • DAG Network Compression
  • DAG Network Encryption
  • Third-Party Replication Mode
  • Alternative Witness Server
  • Alternative Witness Directory
Configuring Databases for High Availability


After creating a DAG, adding Mailbox servers to the DAG, and
configuring the DAG, you must still do the following:

   • Create database copies


   • Set truncation lag time


  • Set replay lag time


  • Set preferred list sequence number
Demonstration: How to Create and Configure a
DAG
In this demonstration, you will see how to create and
configure a DAG
What Is the Transport Dumpster?

The transport dumpster:

  • Protects against Mailbox server failures when
    transaction logs have been lost
  • Keeps copies of all messages delivered in the transport
    queue (mail.que) until the transaction logs have
    replicated to all servers in the DAG, or until the
    maximum dumpster size is reached
  • Redelivers missing email messages when a failure
    occurs
Understanding the Failover Process
If a failure occurs, the following steps occur for the failed
database:
      Active Manager determines the best copy to activate

     The replication service on the target server attempts to
     copy missing log files from the best “source”:
      • If successful, the database mounts with zero data loss
      • If unsuccessful (failover), the database mounts based
        on the AutoDatabaseMountDial setting

      The mounted database generates new log files (using
      the same log generation sequence)

      Transport dumpster requests are initiated for the
      mounted database to recover lost messages

      When original server or database recovers, it
      determines if any logs are missing or corrupt, and fixes
      them if possible
Designing Monitoring and Management for a DAG



  • Allocate the necessary permissions for managing
    a DAG
   •   Organization Management
   •   DAGs
   •   Database copies

  • Failure may not be noticed

  • Exchange Server 2010 SP1 or newer includes
    several scripts and commands for DAG monitoring
    and management

  • Consider using System Center Operations
    Manager 2012
Demonstration: How to Monitor Replication Health
In this demonstration, you will see how to:
• Monitor replication health using the Exchange
 Management Console and the Exchange Management
 Shell
• View various status messages

• View available statistics
Lesson 3: Deploying Highly Available Non-Mailbox
Servers
• How High Availability Works for Client Access Servers

• How Shadow Redundancy Provides High Availability for
 Hub Transport Servers
• How High Availability Works for Edge Transport Servers
How High Availability Works for Client Access
Servers
A client access array is created with multiple Client Access
servers. You can achieve high availability and load balancing by
using one of these methods:

  • Software-based NLB
  • Hardware-based NLB
  • Round-robin DNS


To configure a client access array:
 • Use the New-ClientAccessArray cmdlet

 • Configure existing databases using the Set-
   MailboxDatabase cmdlet with the RpcClientAccess
   parameter

 • Configure internal URIs for Exchange services
How Shadow Redundancy Provides High
Availability for Hub Transport Servers
Transport server delays message deletion until it verifies that
the message has been delivered past the next hop

                                         s
                                       tu
                                 d sta
                              ar                     2.
                        d isc                           De
                                                          liv
                   ry                                        er
              Q ue
           3.                                                   to
                                                                   ne
                                        e1                           xt
                                   Ed g                                   ho
                              r to                                           p
                         elive             Edge1
                    1. D



            4.
               If
                  fa
                     ilu
        Hub             re
                          ,r
                            es
                              ub                                                  External
                                m
                                  it                                             SMTP Mail
                                                                                   Server




                                             Edge2
How High Availability Works for Edge Transport
Servers

Load balancing and high availability methods for Edge Transport
include:

 • Multiple DNS MX records that are created to specify
   multiple authoritative SMTP servers for the domain.
                                               domain
 • Hardware-based load balancing that is used to load
   balance inbound SMTP connections to any available Edge
   Transport server.
             server
Lesson 4: Deploying High Availability with Site
Resilience
• Requirements for Creating a Multiple Site DAG

• What Is Datacenter Activation Coordination Mode?

• Deploying Exchange 2010 for Site Resilience

• Switchover and Switchback Process with Site Resilience

• Best Practices for Site Resilient Solutions
Requirements for Creating a Multiple Site DAG

 Requirements include:

  • At least one Mailbox server in each site

  • Round-trip network latency time of maximum 500
    milliseconds between DAG members

  • Other server roles must be available in each site

  • DAC mode for DAGs that span multiple locations
     • Prevents split-brain syndrome
What Is Datacenter Activation Coordination
Mode?

 DAC mode:

  • Prevents split-brain syndrome
  • Uses the DACP Protocol to decide if a database can be
    mounted
     • 0 : Database cannot be mounted
     • 1 : Database can be mounted


      No DACP=1, no
     database mounted

                                                    Mount
                                                   database


  DACP=1
  DACP=0       DACP=1                   DACP=1
                DACP=1
                DACP=0
     Data center       DAG in DAC       Data center
                                        Data center
      Data center      DAG in DAC            2
          11             Mode
                         Mode                2
Deploying Exchange 2010 for Site Resilience

 Site resiliency:
  • Requires the following server roles to be
    available in each site (besides the Mailbox role):
     • Active Directory Domain Controller
     • Hub Transport server
     • Client Access server

  • Does not require any special configuration for
    Hub Transport and Client Access servers

  • The Edge Transport server:
     • Requires Internet connectivity for the
       alternate data center
     • Requires multiple MX records for incoming
       messages
Switchover and Switchback Process with Site
Resilience



           Site A                                Site B




                Hub Transport         Client Access   Hub Transport
Client Access
                   (FSW)
                                                          (Alt FSW)

                                DAG
Best Practices for Site Resilient Solutions

Best practices include:

 • Reduce failover time by using low TTL on DNS records
   for the Client Access server array, Client Access server
   URLs, and SMTP records


 • Verify failover functionality with periodic testing

 • Closely monitor replication health and other system
   components to ensure failover health


 • Follow proper change-management procedures


 • Prevent cluster network cross-talk
Lab: Implementing High Availability
• Exercise 1: Deploying a DAG

• Exercise 2: Deploying Highly Available Hub Transport and
 Client Access Servers
• Exercise 3: Testing the High Availability Configuration




Logon information




Estimated time: 60 minutes
Lab Scenario
You are the messaging administrator for A. Datum Corporation.
You have completed the basic installation for three Exchange
servers. Now you must complete the configuration so that they
are highly available.
Lab Review
• When might you choose to initiate a database switchover?

• If you deploy only two Hub Transport servers in an Active
 Directory site, would shadow redundancy protect
 messages between mailboxes in the same site?
Module Review and Takeaways
• Review Questions

• Common Issues and Troubleshooting Tips

• Real-World Issues and Scenarios

• Best Practices

More Related Content

What's hot

Setting up a big data platform at kelkoo
Setting up a big data platform at kelkooSetting up a big data platform at kelkoo
Setting up a big data platform at kelkooFabrice dos Santos
 
Hadoop Distributed File System(HDFS) : Behind the scenes
Hadoop Distributed File System(HDFS) : Behind the scenesHadoop Distributed File System(HDFS) : Behind the scenes
Hadoop Distributed File System(HDFS) : Behind the scenesNitin Khattar
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introductionacogoluegnes
 
Hadoop HDFS NameNode HA
Hadoop HDFS NameNode HAHadoop HDFS NameNode HA
Hadoop HDFS NameNode HAHanborq Inc.
 
Ch02 installing exchange
Ch02 installing exchangeCh02 installing exchange
Ch02 installing exchangeShane Flooks
 
Hadoop Summit 2012 | HBase Consistency and Performance Improvements
Hadoop Summit 2012 | HBase Consistency and Performance ImprovementsHadoop Summit 2012 | HBase Consistency and Performance Improvements
Hadoop Summit 2012 | HBase Consistency and Performance ImprovementsCloudera, Inc.
 
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentOct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentYahoo Developer Network
 
Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messagesyarapavan
 
Storage infrastructure using HBase behind LINE messages
Storage infrastructure using HBase behind LINE messagesStorage infrastructure using HBase behind LINE messages
Storage infrastructure using HBase behind LINE messagesLINE Corporation (Tech Unit)
 
Logging Last Resource Optimization for Distributed Transactions in Oracle…
Logging Last Resource Optimization for Distributed Transactions in  Oracle…Logging Last Resource Optimization for Distributed Transactions in  Oracle…
Logging Last Resource Optimization for Distributed Transactions in Oracle…Gera Shegalov
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfsTrendProgContest13
 
Oracle Real Application Cluster ( RAC )
Oracle Real Application Cluster ( RAC )Oracle Real Application Cluster ( RAC )
Oracle Real Application Cluster ( RAC )varasteh65
 
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
Logging Last Resource Optimization for Distributed Transactions in  Oracle We...Logging Last Resource Optimization for Distributed Transactions in  Oracle We...
Logging Last Resource Optimization for Distributed Transactions in Oracle We...Gera Shegalov
 
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010CLOUDIAN KK
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduceUday Vakalapudi
 
Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012StampedeCon
 
IT Camp - Server Migration Overview
IT Camp - Server Migration OverviewIT Camp - Server Migration Overview
IT Camp - Server Migration OverviewHarold Wong
 

What's hot (20)

Ch03 cas
Ch03 casCh03 cas
Ch03 cas
 
Setting up a big data platform at kelkoo
Setting up a big data platform at kelkooSetting up a big data platform at kelkoo
Setting up a big data platform at kelkoo
 
Hadoop Distributed File System(HDFS) : Behind the scenes
Hadoop Distributed File System(HDFS) : Behind the scenesHadoop Distributed File System(HDFS) : Behind the scenes
Hadoop Distributed File System(HDFS) : Behind the scenes
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Big data- HDFS(2nd presentation)
Big data- HDFS(2nd presentation)Big data- HDFS(2nd presentation)
Big data- HDFS(2nd presentation)
 
Hadoop HDFS NameNode HA
Hadoop HDFS NameNode HAHadoop HDFS NameNode HA
Hadoop HDFS NameNode HA
 
Ch02 installing exchange
Ch02 installing exchangeCh02 installing exchange
Ch02 installing exchange
 
Hadoop Summit 2012 | HBase Consistency and Performance Improvements
Hadoop Summit 2012 | HBase Consistency and Performance ImprovementsHadoop Summit 2012 | HBase Consistency and Performance Improvements
Hadoop Summit 2012 | HBase Consistency and Performance Improvements
 
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and DeploymentOct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
Oct 2012 HUG: Hadoop .Next (0.23) - Customer Impact and Deployment
 
Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messages
 
Storage infrastructure using HBase behind LINE messages
Storage infrastructure using HBase behind LINE messagesStorage infrastructure using HBase behind LINE messages
Storage infrastructure using HBase behind LINE messages
 
Logging Last Resource Optimization for Distributed Transactions in Oracle…
Logging Last Resource Optimization for Distributed Transactions in  Oracle…Logging Last Resource Optimization for Distributed Transactions in  Oracle…
Logging Last Resource Optimization for Distributed Transactions in Oracle…
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfs
 
Oracle Real Application Cluster ( RAC )
Oracle Real Application Cluster ( RAC )Oracle Real Application Cluster ( RAC )
Oracle Real Application Cluster ( RAC )
 
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
Logging Last Resource Optimization for Distributed Transactions in  Oracle We...Logging Last Resource Optimization for Distributed Transactions in  Oracle We...
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
 
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010
Summary of "YCSB " paper for nosql summer reading in Tokyo" on Sep 15, 2010
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012Facebook's HBase Backups - StampedeCon 2012
Facebook's HBase Backups - StampedeCon 2012
 
IT Camp - Server Migration Overview
IT Camp - Server Migration OverviewIT Camp - Server Migration Overview
IT Camp - Server Migration Overview
 
Tutorial Haddop 2.3
Tutorial Haddop 2.3Tutorial Haddop 2.3
Tutorial Haddop 2.3
 

Viewers also liked

Viewers also liked (7)

10135 a 09
10135 a 0910135 a 09
10135 a 09
 
50357 a enu-labmanual01
50357 a enu-labmanual0150357 a enu-labmanual01
50357 a enu-labmanual01
 
10135 b 03
10135 b 0310135 b 03
10135 b 03
 
10135 a 07
10135 a 0710135 a 07
10135 a 07
 
10135 a xa
10135 a xa10135 a xa
10135 a xa
 
10135 b 13
10135 b 1310135 b 13
10135 b 13
 
10135 a 00
10135 a 0010135 a 00
10135 a 00
 

Similar to 10135 b 07

Exchang Server 2013 chapter 2
Exchang Server 2013 chapter 2Exchang Server 2013 chapter 2
Exchang Server 2013 chapter 2Osama Mohammed
 
Data Diffing Based Software Architecture Patterns
Data Diffing Based Software Architecture PatternsData Diffing Based Software Architecture Patterns
Data Diffing Based Software Architecture PatternsHuahai Yang
 
Ch05 high availability
Ch05 high availabilityCh05 high availability
Ch05 high availabilityShane Flooks
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inRahulBhole12
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scalejimriecken
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentSpeedment, Inc.
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and ManagingGabriella Davis
 
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...Microsoft Technet France
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Perforce
 
Scott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceScott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceNordic Infrastructure Conference
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio, Inc.
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overviewhowie YU
 

Similar to 10135 b 07 (20)

10135 b 11
10135 b 1110135 b 11
10135 b 11
 
Exchang Server 2013 chapter 2
Exchang Server 2013 chapter 2Exchang Server 2013 chapter 2
Exchang Server 2013 chapter 2
 
Data Diffing Based Software Architecture Patterns
Data Diffing Based Software Architecture PatternsData Diffing Based Software Architecture Patterns
Data Diffing Based Software Architecture Patterns
 
6421 b Module-11
6421 b Module-116421 b Module-11
6421 b Module-11
 
Ch05 high availability
Ch05 high availabilityCh05 high availability
Ch05 high availability
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
NYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ SpeedmentNYJavaSIG - Big Data Microservices w/ Speedment
NYJavaSIG - Big Data Microservices w/ Speedment
 
Domino Server Health - Monitoring and Managing
 Domino Server Health - Monitoring and Managing Domino Server Health - Monitoring and Managing
Domino Server Health - Monitoring and Managing
 
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...
Exchange 2013 Haute disponibilité et tolérance aux sinistres (Session 1/2 pre...
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Google file system
Google file systemGoogle file system
Google file system
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale
 
Scaling tappsi
Scaling tappsiScaling tappsi
Scaling tappsi
 
20345-1B_02.pptx
20345-1B_02.pptx20345-1B_02.pptx
20345-1B_02.pptx
 
Scott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilienceScott Schnoll - Exchange server 2013 high availability and site resilience
Scott Schnoll - Exchange server 2013 high availability and site resilience
 
10135 b 08
10135 b 0810135 b 08
10135 b 08
 
Alluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata ServicesAlluxio - Scalable Filesystem Metadata Services
Alluxio - Scalable Filesystem Metadata Services
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Cloud stack overview
Cloud stack overviewCloud stack overview
Cloud stack overview
 

More from Wichien Saisorn (9)

10135 b 12
10135 b 1210135 b 12
10135 b 12
 
10135 b 10
10135 b 1010135 b 10
10135 b 10
 
10135 b 09
10135 b 0910135 b 09
10135 b 09
 
10135 b 06
10135 b 0610135 b 06
10135 b 06
 
10135 b 05
10135 b 0510135 b 05
10135 b 05
 
10135 b 04
10135 b 0410135 b 04
10135 b 04
 
10135 b 01
10135 b 0110135 b 01
10135 b 01
 
10135 b 00
10135 b 0010135 b 00
10135 b 00
 
10135 b xa
10135 b xa10135 b xa
10135 b xa
 

10135 b 07

  • 2. Module Overview • Overview of High Availability Options • Configuring Highly Available Mailbox Databases • Deploying Highly Available Non-Mailbox Servers • Deploying High Availability with Site Resilience
  • 3. Lesson 1: Overview of High Availability Options • What Is High Availability? • Discussion: Components of a High Availability Solution
  • 4. What Is High Availability? High availability: • Implements system design that ensures a high level of operational continuity • Is measured by the percentage of time the application is available Availability Target Permitted Annual Downtime 99% 87 hours, 36 minutes 99.9% 8 hours, 46 minutes 99.99% 52 minutes, 34 seconds 99.999% 5 minutes, 15 seconds
  • 5. Discussion: Components of a High Availability Solution • Which components are important for running a high availability solution? • What are some single points of failure in a messaging solution?
  • 6. Lesson 2: Configuring Highly Available Mailbox Databases • What Is a Database Availability Group? • What is Quorum? • What Is Active Manager? • What Is Continuous Replication? • How Are Databases Protected in a DAG? • Configuring a Database Availability Group • Configuring Databases for High Availability • Demonstration: How to Create and Configure a DAG • What Is the Transport Dumpster? • Understanding the Failover Process • Designing Monitoring and Management for a DAG • Demonstration: How to Monitor Replication Health
  • 7. What Is a Database Availability Group? A DAG is a collection of servers that provides the infrastructure for replicating and activating database copies. DAGs: • Require the failover clustering feature, although all installation and configuration is done with the Exchange Server management tools • Use Active Manager to control failover • Use an enhanced version of the continuous replication technology that Exchange Server 2007 introduced • Can be created after the Mailbox server is installed • Allow a single database to be activated on another server in the group without affecting other databases • Allow up to 16 copies of a single database on separate servers • Define the boundary for replication
  • 8. What Is Quorum? Quorum defines consensus that enough cluster members are available to provide services Exchange Server 2010 DAG quorums: • Are based on votes in Windows Server 2008 • Allow nodes, file shares, and shared disks to have votes, depending on the quorum mode • Use node majority with a witness server for quorum: • DAGs with an even number of Mailbox servers use the witness server • DAGs with an odd number of Mailbox servers use node majority
  • 9. What Is Active Manager? Active Manager: • Runs a process on each server in the DAG • One node is the PAM • Remaining nodes are SAM • Manages which database copies are active and which are passive • Stores database state information • Manages database switchover and failover processes • Does not require direct administration configuration
  • 10. What Is Continuous Replication? Database Availability Group DB1 DB1 DB1 File Mode Block Mode Replication Log Buffer Replication Log Buffer ESE Log Buffer
  • 11. How Are Databases Protected in a DAG? Continuous replication protects databases across servers in the DAG DB1 DB2 DB2 DB2 DB3 DB3 DB4 DB4 DB4
  • 12. Configuring Database Availability Group To configure DAGs you must define the following: • Witness Server – Server used to store witness information • Witness Directory – directory used on the witness server to store witness information • Database availability group IP addresses – IP address(es) used by DAG Additionally consider these settings for larger or multi-site implementations: • DAG Networks including replication • DAG Network Compression • DAG Network Encryption • Third-Party Replication Mode • Alternative Witness Server • Alternative Witness Directory
  • 13. Configuring Databases for High Availability After creating a DAG, adding Mailbox servers to the DAG, and configuring the DAG, you must still do the following: • Create database copies • Set truncation lag time • Set replay lag time • Set preferred list sequence number
  • 14. Demonstration: How to Create and Configure a DAG In this demonstration, you will see how to create and configure a DAG
  • 15. What Is the Transport Dumpster? The transport dumpster: • Protects against Mailbox server failures when transaction logs have been lost • Keeps copies of all messages delivered in the transport queue (mail.que) until the transaction logs have replicated to all servers in the DAG, or until the maximum dumpster size is reached • Redelivers missing email messages when a failure occurs
  • 16. Understanding the Failover Process If a failure occurs, the following steps occur for the failed database: Active Manager determines the best copy to activate The replication service on the target server attempts to copy missing log files from the best “source”: • If successful, the database mounts with zero data loss • If unsuccessful (failover), the database mounts based on the AutoDatabaseMountDial setting The mounted database generates new log files (using the same log generation sequence) Transport dumpster requests are initiated for the mounted database to recover lost messages When original server or database recovers, it determines if any logs are missing or corrupt, and fixes them if possible
  • 17. Designing Monitoring and Management for a DAG • Allocate the necessary permissions for managing a DAG • Organization Management • DAGs • Database copies • Failure may not be noticed • Exchange Server 2010 SP1 or newer includes several scripts and commands for DAG monitoring and management • Consider using System Center Operations Manager 2012
  • 18. Demonstration: How to Monitor Replication Health In this demonstration, you will see how to: • Monitor replication health using the Exchange Management Console and the Exchange Management Shell • View various status messages • View available statistics
  • 19. Lesson 3: Deploying Highly Available Non-Mailbox Servers • How High Availability Works for Client Access Servers • How Shadow Redundancy Provides High Availability for Hub Transport Servers • How High Availability Works for Edge Transport Servers
  • 20. How High Availability Works for Client Access Servers A client access array is created with multiple Client Access servers. You can achieve high availability and load balancing by using one of these methods: • Software-based NLB • Hardware-based NLB • Round-robin DNS To configure a client access array: • Use the New-ClientAccessArray cmdlet • Configure existing databases using the Set- MailboxDatabase cmdlet with the RpcClientAccess parameter • Configure internal URIs for Exchange services
  • 21. How Shadow Redundancy Provides High Availability for Hub Transport Servers Transport server delays message deletion until it verifies that the message has been delivered past the next hop s tu d sta ar 2. d isc De liv ry er Q ue 3. to ne e1 xt Ed g ho r to p elive Edge1 1. D 4. If fa ilu Hub re ,r es ub External m it SMTP Mail Server Edge2
  • 22. How High Availability Works for Edge Transport Servers Load balancing and high availability methods for Edge Transport include: • Multiple DNS MX records that are created to specify multiple authoritative SMTP servers for the domain. domain • Hardware-based load balancing that is used to load balance inbound SMTP connections to any available Edge Transport server. server
  • 23. Lesson 4: Deploying High Availability with Site Resilience • Requirements for Creating a Multiple Site DAG • What Is Datacenter Activation Coordination Mode? • Deploying Exchange 2010 for Site Resilience • Switchover and Switchback Process with Site Resilience • Best Practices for Site Resilient Solutions
  • 24. Requirements for Creating a Multiple Site DAG Requirements include: • At least one Mailbox server in each site • Round-trip network latency time of maximum 500 milliseconds between DAG members • Other server roles must be available in each site • DAC mode for DAGs that span multiple locations • Prevents split-brain syndrome
  • 25. What Is Datacenter Activation Coordination Mode? DAC mode: • Prevents split-brain syndrome • Uses the DACP Protocol to decide if a database can be mounted • 0 : Database cannot be mounted • 1 : Database can be mounted No DACP=1, no database mounted Mount database DACP=1 DACP=0 DACP=1 DACP=1 DACP=1 DACP=0 Data center DAG in DAC Data center Data center Data center DAG in DAC 2 11 Mode Mode 2
  • 26. Deploying Exchange 2010 for Site Resilience Site resiliency: • Requires the following server roles to be available in each site (besides the Mailbox role): • Active Directory Domain Controller • Hub Transport server • Client Access server • Does not require any special configuration for Hub Transport and Client Access servers • The Edge Transport server: • Requires Internet connectivity for the alternate data center • Requires multiple MX records for incoming messages
  • 27. Switchover and Switchback Process with Site Resilience Site A Site B Hub Transport Client Access Hub Transport Client Access (FSW) (Alt FSW) DAG
  • 28. Best Practices for Site Resilient Solutions Best practices include: • Reduce failover time by using low TTL on DNS records for the Client Access server array, Client Access server URLs, and SMTP records • Verify failover functionality with periodic testing • Closely monitor replication health and other system components to ensure failover health • Follow proper change-management procedures • Prevent cluster network cross-talk
  • 29. Lab: Implementing High Availability • Exercise 1: Deploying a DAG • Exercise 2: Deploying Highly Available Hub Transport and Client Access Servers • Exercise 3: Testing the High Availability Configuration Logon information Estimated time: 60 minutes
  • 30. Lab Scenario You are the messaging administrator for A. Datum Corporation. You have completed the basic installation for three Exchange servers. Now you must complete the configuration so that they are highly available.
  • 31. Lab Review • When might you choose to initiate a database switchover? • If you deploy only two Hub Transport servers in an Active Directory site, would shadow redundancy protect messages between mailboxes in the same site?
  • 32. Module Review and Takeaways • Review Questions • Common Issues and Troubleshooting Tips • Real-World Issues and Scenarios • Best Practices

Editor's Notes

  1. Module 7: Implementing High Availability Course 10135B Presentation: 85 minutes Lab: 90 minutes After completing this module, students will be able to: Describe the high availability options Configure highly available mailbox databases Deploy highly available non-Mailbox servers Deploy high availability solution for multiple sites Required materials To teach this module, you need the Microsoft® Office PowerPoint® file 10135A_07.ppt. Important: We recommend that you use PowerPoint 2002 or a newer version to display the slides for this course. If you use PowerPoint Viewer or an earlier version of PowerPoint, all the features of the slides might not be displayed correctly. Preparation tasks To prepare for this module: Read all of the materials for this module. Practice performing the demonstrations and the lab exercises. Work through the Module Review and Takeaways section, and determine how you will use this section to reinforce student learning and promote knowledge transfer to on-the-job performance. Note about the demonstrations : To prepare for the demonstrations, start the 10135A-VAN-DC1 virtual machine and log on to the server before starting the other virtual machines. To save time during the demonstrations, log on to the Exchange servers and open the Exchange Server management tools before starting the demonstrations. Additionally, connect to the Microsoft Outlook® Web App site on the Exchange servers, and then log on as Administrator. It can take more than a minute to open the management tools and Outlook Web App for the first time.
  2. Module 7: Implementing High Availability Course 10135B
  3. Module 7: Implementing High Availability Course 10135B
  4. Lead a discussion about high availability. Define high availability as the implementation of a system design that ensures a high level of operational continuity over a specific time period. Many students may attribute high availability to failover clustering or load balancing. However, you can achieve high availability only with good design, testing, training, and operational processes. Also mention that you measure availability by the percentage of time the application is available. Also be sure to note that planned downtime is usually exempted from these measurements. Planned downtime is downtime scheduled for routine proactive updates and maintenance. References Microsoft High Availability White Paper http://go.microsoft.com/fwlink/?LinkId=179979 Module 7: Implementing High Availability Course 10135B
  5. Lead a discussion about the components that make up a high availability solution. Question: Which components are important for running a high availability solution? Answer: Be sure to capture these main points, but know that students answers may vary: Redundant hardware components Server hardware Network hardware Storage hardware Software components (clustering) Operational readiness Operational training Proactive maintenance Change management Detailed system health monitoring Question: What are common single points of failure in a messaging solution? Answer: Answers will vary. Some may include: Internet connectivity; server failures with hard drives, fans and power supplies; and environmental factors, such as power and cooling. Module 7: Implementing High Availability Course 10135B
  6. Module 7: Implementing High Availability Course 10135B
  7. Explain a database availability group (DAG) and how it leverages continuous transaction log replication to the passive database copies within the DAG. Point out that a DAG: Requires the failover clustering feature, although all installation and configuration tasks occur with the Exchange Server management tools . Although a DAG requires the failover clustering feature, Exchange Server does not use Windows® failover clustering to handle database failover. Instead, it uses Active Manager to control failover. Uses an enhanced version of the continuous replication technology that Exchange Server 2007 introduced . Exchange Server 2010 improves upon the best continuous replication pieces from Exchange Server 2007. Can be created after you have installed the Mailbox server . You can set up a Mailbox server to host active mailboxes, and then add to a DAG later. Allows a single database to be moved between servers in the group without affecting other databases . Failover clustering is done per mailbox database, not for an entire server, making Exchange Server 2010 more flexible than previous Exchange Server versions. Allows up to 16 copies of a single database on separate servers . You can add 16 servers to a DAG, which allows you to create 16 databases. Defines the boundary for replication since only servers within the DAG can host database copies . You cannot replicate database information to Mailbox servers outside the DAG. Question: During installation, do you need to decide if you want to join a server to a DAG? Answer: No, you can freely add and remove servers from a DAG after installation. This is a significant benefit over Exchange Server 2003 and Exchange Server 2007. Module 7: Implementing High Availability Course 10135B
  8. Explain the concept of quorum and how node majority is reached. You can, for example, draw an example on a whiteboard with three servers and a witness server. Explain what happens if one node goes down. Basically the two servers together with the witness server vote will reach quorum and the cluster will mount. Then explain what happens if another node fails, leaving only one server and the witness server. They do not reach quorum and cannot mount automatically. You need to manually mount this cluster if needed. Module 7: Implementing High Availability Course 10135B
  9. Discuss the role of the Active Manager. Mention that although the DAG relies on the failover clustering feature, it relies on Active Manager to do the failover or switchover processes. References Exchange Server 2010 Help: Understanding Mailbox Database Availability: Active Manager Module 7: Implementing High Availability Course 10135B
  10. Explain the scenario first. We have a DAG that contains three servers, each with a database copy of DB1. The left DB1 (blue) is the active database, the other two databases are passive copies. Talk through the continuous replication–file mode process using the slide: Exchange Server writes, and then closes the active log. The Replication Service replicates the closed log to servers hosting the passive databases. Because each copy of the database is identical, Exchange Server inspects, and then replays or applies, the transaction logs to the database copies. The databases remain in sync. Mention that you must seed the databases for this process. Seeding a database ensures that a good copy of the database is available on each Exchange server, so that you can replay transaction logs as appropriate. Next describe continuous replication – block mode, which is new for Exchange Server 2010 Service Pack 1 (SP1). Explain that this mode is automatically triggered when Exchange Server 2010 recognizes that all log files have been copied correctly to the Exchange servers that host the passive databases. Then talk through the block mode process: Once in block mode, any block of data written to the ESE log buffer on the Exchange server that hosts the active database is automatically copied to the replication log buffer of all passive database copies. When the ESE log buffer is full, the final block is sent to the passive databases and a transactional log file is written on the Exchange server that hosts the active database. Then the ESE log buffer is emptied. When the Exchange servers hosting the passive databases receive the final block that fills up their replication log buffer, they also save the buffer to a transactional log file. After that the buffer is emptied and the process starts again. When the server with the active database fails, but the replication log buffer is not full yet, then the buffer is saved to a new transactional log file. Question: How do you configure continuous replication – block mode? Answer: Continuous replication – block mode is enabled by default. You do not need to configure it. Reference Understanding High Availability and Site Resilience http://go.microsoft.com/fwlink/?LinkID=212703 Module 7: Implementing High Availability Course 10135B
  11. The active database copy uses continuous replication to keep the passive copies in sync, based on their lag-time setting. A DAG leverages the Windows Server® operating system failover clustering feature. However, it relies on the Active Manager server to maintain status of all of the databases that the DAG hosts. A single database can be switched or failed between servers within the DAG. At any given time, a copy is either the replication source or the replication target, but not both. A server may not host more than one copy of a given database. Not all databases need to have the same number of copies. In a 16-node DAG, one database can have 16 copies, while another database is not redundant and contains only the one active copy. Discuss the difference between a failover and a switchover. Question: Can you have one database copied three times, and another database copied only one time? Answer: Yes. Database copies are configured on a per-database basis, thus you can decide how many copies of a specific database you want to create, and on which Exchange servers they should be hosted. References Exchange Server 2010 Help: Understanding Mailbox Database Availability Module 7: Implementing High Availability Course 10135B
  12. This topic discusses all the relevant DAG configuration settings that are available to configure a DAG. You should explain the details about each point and mention when to use it. For example, mention that you need to configure DAG Networks only if you have multiple sub-nets and the automatic discovery of these subnets was not working correctly. Module 7: Implementing High Availability Course 10135B
  13. Discuss the process of configuring a database to be highly available within a DAG. This includes to describe and discuss the importance to configure truncation, replay lag times, and preferred list sequence number. Question: How do you plan to use the preferred list sequence number? Answer: Answers may vary. However, many students will prefer to spread the activity to multiple servers. Rotating the preference for the databases through all available servers allows each server to serve client requests actively. References Exchange Server 2010 Help: Add a Mailbox Database Copy Module 7: Implementing High Availability Course 10135B
  14. Demonstrate how to create a DAG by adding at least two servers to the group, and then adding a copy of one of the databases. Discuss the prerequisites for configuring a DAG, such as Windows Server 2008 Enterprise, and that Microsoft recommends that the failover clustering feature be installed prior to creating a DAG. Preparation Ensure that the 10135A-DC1, 10135A-VAN-EX1, and 10135A-VAN-EX2 virtual machines are running. Log on to 10135A-VAN-EX1 as Administrator with the password of Pa$$w0rd . Demonstration Steps On VAN-EX1, click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Console . In the console tree, expand Microsoft Exchange On-Premises , expand Organization Configuration , and then click Mailbox . In the results pane, click the Database Availability Groups tab. On Actions pane, click New Database Availability Group . In the New Database Availability Group Wizard , type DAG1 in the Database availability group name field, click Witness Server , and then type VAN-DC1 in the host name field. Click Witness Directory , type C:\\FSWDAG1 in the path field, and then click New . In the New Database Availability Group Wizard , on Completion page, you can ignore the warning, and then click Finish . Note: In an actual business environment we recommend that you use the local Hub Transport server, not a domain controller, to act as the file share witness. A two-node DAG configuration requires a file share witness, since it requires a majority of votes at all times to maintain quorum. In a two-node cluster without a file share witness, when you reboot one of the nodes, a majority of votes cannot be obtained, so quorum is not maintained, and the cluster fails. You can specify the Hub Transport server and the local directory that you want to configure as the file share witness when you create a DAG. As a best practice, you should add the file share witness to other clusters too. Clusters with even numbers of nodes use the file share witness as a tiebreaker vote in establishing quorum. In the Work pane on the Database Availability Groups tab, right-click DAG1 , and then click Properties from the context menu. In the DAG1 properties window, click IP Addresses tab. In IP Addresses tab, click Add, and then type 10.10.0.25 in Database availability group IP addresses field. Click OK , and then click OK to leave DAG1 properties. In the Microsoft Exchange Warning window, click OK . Module 7: Implementing High Availability Course 10135B
  15. In the Work pane on the Database Availability Groups tab, right-click DAG1 , and then click Manage Database Availability Group Membership from the context menu. In the Manage Database Availability Group Membership Wizard, click Add . In the Select Mailbox Server dialog box, click VAN-EX1 , and then click OK . In the Manage Database Availability Group Membership Wizard, click Add . In the Select Mailbox Server dialog box, click VAN-EX2 , and then click OK . In the Manage Database Availability Group Membership Wizard, click Manage to complete the changes, and then click Finish to close the wizard. In the Results pane, click the Database Management tab. In the Results pane, click Mailbox Database 1 , and then in the Actions pane, click Add Mailbox Database Copy . If you do not see Add Mailbox Database Copy on Actions pane, click Refresh . In the Add Mailbox Database Copy Wizard, click Browse to select the server to which to add the copy. In the Select Mailbox Server dialog box, click VAN-EX2 , and then click OK . In the Add Mailbox Database Copy Wizard, click Add to create the copy of Mailbox Database 1. Review the results, and then click Finish .   Note  Once you create a DAG, you then can create and configure DAG networks for replication or for MAPI traffic. Add additional networks for redundancy or improved throughput. Question : What information do you need before you can configure a DAG? Answer: At minimum, the administrator needs to know within which network the DAG will reside and the servers that will participate. Reference Exchange Server 2010 Help: Create a Database Availability Group, Managing Database Availability Group Membership, and Managing Database Availability Groups Module 7: Implementing High Availability Course 10135B
  16. The transport dumpster was first introduced in Exchange Server 2007 to allow for recovery of database failovers of Clustered Continuous Replication (CCR) and Local Continuous Replication (LCR) availability solutions. Module 7: Implementing High Availability Course 10135B
  17. Discuss what would happen if a failure occurred and not all of the logs could be copied to the passive copy before you need to make it active. Discuss how to use the transport dumpster when failover occurs. Active manager selects the best copy to activate, as follows: 1. Enumerates the available copies 2 Ignores all servers that are unreachable 3. Sorts available copies by how up-to-date they are (content index, copy queue length and replay queue length). Breaks ties based on activation preference Activity Draw a simple Exchange Server messaging environment on the board. Be sure to include one Hub Transport server and two Mailbox servers configured in a DAG. Now walk through a scenario in which a message is delivered through the Hub Transport server to the active database on the first Mailbox server. Just after Exchange Server delivers the message, the server fails, and no data is recovered. Now walk through the steps on the slide to discuss how this database failover process works. Discuss the AutoDatabaseMountDial setting and when to use it. The three possible values for the AutoDatabaseMountDial setting are: BestAvailability . If you specify this value, the database automatically mounts if the copy queue length is less than or equal to twelve. The copy queue length is defined as the number of logs that the passive copies recognize as needing to be replicated. If the copy queue length is more than twelve, the database will not automatically mount. When the copy queue length is less than or equal to twelve, Exchange Server attempts to replicate the remaining logs to the passive copies and mount the database. GoodAvailability . If you specify this value, the databases automatically mount immediately after a failover, if the copy queue length is less than or equal to six. If the copy queue length is more than six, the databases do not automatically mount. When the copy queue length is less than or equal to six, Exchange Server attempts to replicate the remaining logs to the passive copy and mount the database. Lossless . If you specify this value, the databases do not automatically mount until all logs that were generated on the active copy are copied to the passive copy. References Exchange Server 2010 Help: Set-MailboxServer Module 7: Implementing High Availability Course 10135B
  18. You should only allocate permissions to those who are knowledgeable about DAGs, and are responsible for their maintenance. The ability to manage DAGs and database copies may not be required by most Exchange Server administrators. One of the unique problems in managing DAGs is that you may not notice the failure of a DAG member immediately, because users are not affected. To actively monitor DAG members, consider using Microsoft Systems Center Operations Manager 2012, or another monitoring tool. Question: Which users in your organization will have permission to manage DAGs? Answer: Answers will vary. In general, large organizations may have a small, dedicated group for managing the DAG-level components. Mid-sized organizations will most likely allow Exchange Server administrators to manage the DAG. Module 7: Implementing High Availability Course 10135B
  19. Demonstrate how to use the Exchange Management Console and Exchange Management Shell to review the available information regarding database replication health. In the demonstration, show how to view the health status of the database copies in the Exchange Management Console or Exchange Management Shell. Preparation Ensure that the 10135A-DC1, 10135A-VAN-EX1 and 10135A-VAN-EX2 virtual machines are running. Log on to 10135A-VAN-EX1 as Administrator with the password of Pa$$w0rd . Demonstration Steps On VAN-EX1, click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Console . In the Console Tree, expand Microsoft Exchange On-Premises , expand Organization Configuration , and then click Mailbox . In the Results pane, click the Database Management tab. In the Results pane, click Mailbox Database 1 , and then in the Actions pane, in the bottom Mailbox Database 1 area, click Properties . Review the information on the General tab: The database status might be Healthy, Initializing, Failed, Mounted, Dismounted, Disconnected, Suspended and Failed, Suspended, Resynchronizing, Seeding. Describe Copy queue length (logs) and Replay queue length (logs) . Click OK to close. Click Start , click All Programs , click Microsoft Exchange Server 2010 , and then click Exchange Management Shell . At the Exchange Management Shell prompt, type Test-ReplicationHealth , and then press Enter. This cmdlet performs a variety of tests, and reports back the status for the replication components. At the Exchange Management Shell prompt, type Get-MailboxDatabaseCopyStatus , and then press Enter. This cmdlet shows status information about all database copies available on the local server. At the Exchange Management Shell prompt, type CD “C:\\Program Files\\Microsoft\\Exchange Server\\V14 At the Exchange Management Shell prompt, type .\\CheckDatabaseRedundancy.ps1 –MaiboxDatabaseName “Mailbox Database 1” , and then press Enter. Notice that the CurrentStatus should be Green to indicate that the all databases are healthy and the database is mounted on the server with the lowest Activation preference number (APN). Question : Why is monitoring these statistics important? Answer : High availability is more than just redundant software and hardware. It is a crucial tool for identifying and reacting to problems quickly and effectively. Monitoring the statistics can help you do this. Module 7: Implementing High Availability Course 10135B
  20. Module 7: Implementing High Availability Course 10135B
  21. Discuss creating a client access array, and that you should create an array of Client Access servers (using the New-ClientAccessArray cmdlet) in each site, and then load-balance using one of the methods listed on the slide. Question: What three load-balancing possibilities can you use for your Client Access servers? Answer: You can use hardware load balancing, software load balancing, or round-robin Domain Name System (DNS). Course 10165A Module 8: Implementing High Availability
  22. Walk the students through a typical shadow redundancy scenario: Hub delivers message to Edge Hub opens a Simple Mail Transfer Protocol (SMTP) session with Edge. Edge advertises shadow redundancy support. Hub notifies Edge to track discard status. Hub submits message to Edge. Edge acknowledges the receipt of message, and records the Hub’s name for sending the message’s discard information. Hub moves the message to the shadow queue for Edge, and marks Edge as the primary server. Hub becomes the shadow server. Edge delivers message to next hop: Edge submits message to third-party mail server. Third party mail server acknowledges the message’s receipt. Edge updates the message’s discard status as delivery complete. Hub queries Edge for discard status (success case): At end of each SMTP session with Edge, Hub queries Edge for discard status on messages previously submitted. If Hub has not opened any SMTP sessions with Edge after the initial message submission, it will open an SMTP session with Edge just to query for discard status after a specific time. Edge checks local discard status, sends back the list of messages that have been delivered, and removes the discard information. Hub server deletes the list of messages from its shadow queue. Hub queries Edge for discard status and resubmits the message (failure case): If Hub cannot contact Edge, Hub resumes the primary server role, and resubmits the messages in the shadow queue. Resubmitted messages are delivered to another Edge server, and the workflow starts from step 1. Within Exchange Server 2010, the Shadow Redundancy Manager is the core component of a Transport server that is responsible for managing shadow redundancy. The Shadow Redundancy Manager is responsible for maintaining the following information for all the primary messages that a server is currently processing: The shadow server for each primary message being processed. The discard status to be sent to shadow servers. Explain what feature Shadow redundancy promotion is and that it is available since Exchange Server 2010 SP1. If time permits, allow the students to create other scenarios that might occur, and then use the board to walk through how shadow redundancy provides easy recovery. Question: On which Exchange 2010 server roles can you find the Shadow Redundancy Manager? Answer: You find the Shadow Redundancy Manager on the servers roles that route messages, such as Hub Transport server role and Edge Transport server role. Module 7: Implementing High Availability Course 10135B
  23. Discuss the high availability options for the Edge Transport role. Be sure to mention why you might choose these options. Edge Transport also supports shadow redundancy, as mentioned in the previous slide, but shadow redundancy does not cover all scenarios because most of the messaging servers to which the Edge Transport role communicates do not support shadow redundancy. Acknowledge that you would use similar methods if you were using Hub Transports for inbound and outbound email. Module 7: Implementing High Availability Course 10135B
  24. Module 7: Implementing High Availability Course 10135B
  25. The design of DAGs allows you to add a Mailbox server from an alternate site. Then, you can add a database copy to the servers in the alternate sites as well. By doing this, the DAG is configured for site resilience, and no special configuration is required. However, network latency must be below 250 milliseconds round-trip time, otherwise the DAG members cannot communicate. The alternate site must have Hub Transport and Client Access servers available, but that is a basic requirement of installing a Mailbox server in that location. Those roles can be placed on the Mailbox server, if required. Datacenter Activation Coordination mode can be used for DAGs with three or more members extended to multiple sites. The Datacenter Activation Coordination mode prevents split-brain syndrome by not allowing automatic recovery to the main site even if quorum is achieved, because the remote site may be running the databases. Question: What is the maximum round trip network latency time of DAG members? Answer: The maximum is 250 milliseconds between each DAG member. Module 7: Implementing High Availability Course 10135B
  26. Explain why Datacenter Activation Coordination mode was implemented in Exchange Server 2010. The scenario shown in the animation is a Datacenter Activation Coordination mode with a DAG spanning two Active Directory® directory service sites. Database 1 is mounted due to quorum. Talk the students through a site failover process with the animation, as follows: The primary data center and the wide area network (WAN) link fails. As the database was very important, you decide to bring the database online at the secondary data center. You use a manual process to mount the database. Once the servers at the primary data center are back online—the WAN connection is still down—no server will automatically mount a database, because their DACPs are 0 as they cannot reach the servers in the secondary data center. When the WAN link is recovered, the databases at the primary site “see” the databases at the secondary site, and begin to synchronize. You then can safely mount back the active database to the primary datacenter. Be sure to mention that Exchange Server 2010 SP1 supports Datacenter Activation Coordination mode with two-member DAGs that have each member in a separate data center. Question: When should you consider configuring the Datacenter Activation Coordination mode? Answer: You should consider the Datacenter Activation Coordination mode as soon as you configure DAG members located at two or more Active Directory sites. Reference Understanding Datacenter Activation Coordination Mode http://go.microsoft.com/fwlink/?LinkID=185451 Module 7: Implementing High Availability Course 10135B
  27. Mention that site resilience requires the following server roles to be available in each site besides the Mailbox roles that you already configured in a DAG: Active Directory Domain Controller Hub Transport server Client Access server Explain that the DC; Hub Transport and Client Access servers do not require any special configuration to support site resiliency, but need to be available in the remote site. Question: When using Exchange Server 2010 SP1 or newer, do you need to update the RPCClientAccessServer property when a site failover occurs and your Client Access servers at the original site are not available anymore? Answer: No, if you configured cross-site direct connect. Module 7: Implementing High Availability Course 10135B
  28. You must enable DAC mode for DAGs to allow for an administrator-activated site switchover. When the primary data center fails, the switchover process includes: DNS records are adjusted for Simple Mail Transfer Protocol (SMTP), Outlook Live, Autodiscover, Outlook Anywhere, and any legacy protocols (if necessary). You configure DAG to remove the primary site servers from the Windows Failover Cluster, but not to remove them from the DAG. You configure DAG to use an alternative file-share witness, and to restore the secondary site’s functionality. Remaining Active Managers coordinate mounting databases in secondary site. To switch back, replicate the database copy back to the original site, and then activate it in the original site. The activation does not need to be done during off-hours, but can be done to reduce the risk of a problem affecting users. Question: What is the basic requirement for switching the active database to a secondary data center? Answer: The DAG needs to be in Datacenter Activation Coordination mode. Module 7: Implementing High Availability Course 10135B
  29. Discuss the following best practices that users should follow for multisite failover. Ask the students about the practices that they find useful, or that they expect to find useful. Use time-to-live (TTL) on DNS records for the Client Access server array, Client Access server URLs, and SMTP records, to reduce failover time. Having a low TTL enables the DNS clients to more quickly discover the DNS entries that point to the secondary site. Monitor all aspects of the Exchange Server environment to ensure that it is functioning normally, and that mailbox data is successfully being replicated in a timely manner to the secondary site. Also, talk about scheduling periodic failover tests to provide an additional level of preparation, and to provide validation to the configuration and operation of the cross-site failover process. Change management is important, and you should use it to ensure that each Mailbox server in the DAG, each Client Access server, and each Hub Transport server are configured identically and have the same updates applied. Doing this reduces the possibility of incompatibilities and/or the unexpected behavior that could occur after a failover. To reduce network congestion and potential communications problems, cluster heartbeat communications should not be allowed between these networks. Question: In your organization, do you regularly perform disaster recovery activities such as recovering a backup in Exchange Server? Answer: The answer depends on the organization. However, it is a best practice to regularly practice failover, backup, and recovery to ensure that Exchange Server 2010 performs as expected. Module 7: Implementing High Availability Course 10135B
  30. In this lab, students will: Deploy a DAG Deploy highly available Hub Transport and Client Access servers Exercise 1 In this exercise, students will deploy a DAG and test continuous replication. Inputs: Students will be provided with the instructions on how to configure a DAG, and then how to configure multiple database copies. Outputs: Students will configure a DAG and configure multiple copies of a database between multiple servers. Students also will verify that continuous replication is working by performing a database switchover. Exercise 2 In this exercise, students will configure and test highly available Client Access and Hub Transport servers. Inputs: Students will be given instructions on how to make the Client Access and Hub Transport servers highly available. Outputs: Students will configure and verify network load balancing for Client Access servers, and verify that the Hub Transport servers are highly available. Exercise 3 In this exercise, students will verify that the high availability configuration is working properly. Inputs: Students will be give instructions on how to test the configuration for successful configuration. Outputs: Students will use the SMTP service and Queue Viewer to verify messages, and simulate a server failure. Before the students begin the lab, read the scenario associated with each exercise to the class. This will reinforce the broad issue that the students are troubleshooting and will help to facilitate the lab discussion at the module’s end. Remind the students to complete the discussion questions after the last lab exercise. Module 7: Implementing High Availability Course 10135B
  31. Module 7: Implementing High Availability Course 10135B
  32. Use the questions on the slide to guide the debriefing after students complete the lab exercises. Question: When might you choose to initiate a database switchover? Answer: You can initiate database switchovers to move databases off a DAG member for maintenance tasks, such as applying software updates. Question: If you deploy only two Hub Transport servers in an Active Directory site, would shadow redundancy protect messages between mailboxes in the same site? Answer: Shadow redundancy does not protect messages delivered within the same site, because the messages will not have traversed more than one Hub Transport server. However, you can recover these messages using the transport dumpster functionality. Module 7: Implementing High Availability Course 10135B
  33. Review Questions Question: What are the requirements for using the Datacenter Activation Coordination mode? Answer: You require Exchange Server 2010 SP1 or newer, the DAG must have at least two members, and span at least two Active Directory sites. Question: Which Exchange Server 2010 feature provides fault tolerance for message delivery? Answer: Shadow redundancy is a new feature in Exchange Server 2010 that ensures messages are not lost when a Hub Transport server fails. The sending Hub Transport server keeps a copy of a message until the next hop reports the message as delivered. Question: In which scenarios might you use hardware load balancing with Edge Transport servers? Answer: In high utilization scenarios requiring hundreds of Edge Transport servers, it may make more sense to use a hardware load balancer than to create hundreds of DNS MX records. Doing this also may reduce the number of public IP addresses required. Question: Besides planning for Exchanger Server failures, what other failures should you consider? Answer: Exchange Server high availability configurations protect against software and server failures, and database corruption. It is important to consider larger issues, such as local network failures, Internet connectivity issues, and data center power and cooling failures. Question: In which scenarios might you use hardware load balancing with Edge Transport servers? Answer: In high utilization scenarios requiring hundreds of Edge Transport servers, it may make more sense to use a hardware load balancer than to create hundreds of DNS MX records. Doing this also may reduce the number of public IP addresses required. Question: To make a highly available Exchange Server 2010 organization, which components must be highly available? Answer : All components need to be highly available. If one component in the Exchange Server 2010 organization is not highly available, then the entire Exchange Server 2010 organization is not highly available. You need to consider high availability for all Exchange Server roles, supporting services, and infrastructure. Question: How many networks should you use for a DAG? Answer : You should use at least two networks for a DAG: the first network is dedicated to replication of transaction logs; the second network is primarily used for Messaging Application Programming Interface (MAPI) communication, but can also be used for replication if the replication network fails. Module 7: Implementing High Availability Course 10135B
  34. Common Issues Related to Creating High Availability Edge Transport Solutions Identify the causes for the following common issues related to high availability Edge Transport servers, and fill in the troubleshooting tips. For answers, refer to relevant lessons in the module. Real-World Issues and Scenarios Question: An organization has several branch offices with a small number of employees. However, the organization needs to deploy a high availability solution in the remote offices. What configuration can it deploy to meet it business needs? Answer: It may be possible to deploy two servers and install the Mailbox, Hub Transport, and Client Access server roles on both. The organization can create a DAG and use a hardware load balancer to load balance client access connectivity. Question: An organization uses a variety of service-level agreements for database availability for different business units. It wants to minimize the number of mailbox servers it deploys. How can it do this? Answer: Deploy all Mailbox servers in a single DAG, and then configure each of the business unit’s mailbox databases with the appropriate number of copies to meet the service.   Module 7: Implementing High Availability Course 10135B Issue Troubleshooting tip Inbound email is not being delivered evenly across all of the Edge Transport servers. Ensure that the DNS MX records have the same value. If the values are not the same, only the records with the lowest value will be used. After deploying highly available Edge Transport servers, outbound email is being returned as possible spam. Verify that your outbound mail servers are configured with a host name that is resolvable on the Internet. Many servers reject email from servers that do not have a name or an IP address that can be resolved on the Internet.
  35. Best Practices Related to Designing a High Availability Solution Supplement or modify the following best practices for your own work situations: Identify all possible failure points before designing a solution. Even the most elaborate and expensive designs can have a simple and crippling failure point. Document all of the components to the solution so that everyone involved in the deployment understands how the solution is configured. Follow change-management procedures. In some environments, it may be tempting to skip these steps. However, not following proper change-management procedures often leads to extended, unplanned downtime. Use a client access array and load-balancing to make client access highly available. If a Client Access server is also a member of a DAG, then use hardware-based load-balancing. Ensure that Internet-accessible sites that proxy Client Access for multiple sites are highly available, because their outage will affect many users. When a mailbox database fails over to an alternate site for a short period of time, allow the clients to continue using the client access array in the original site. Module 7: Implementing High Availability Course 10135B