Exchange Server 2013
High Availability and Site Resilience
(1/2)
Scott Schnoll
Senior Content Developer
Microsoft Corporation
scott.schnoll@microsoft.com
http://aka.ms/Schnoll
Twitter: @Schnoll

Infrastructure, communication & collaboration
Agenda – Part 1
• Database Availability Group Internals
• Witness Server

#mstechdays

Agenda – Part 2 (16:30-17:15, room 253)
• Dynamic quorum
• DAG member maintenance

DATABASE AVAILABILITY GROUP INTERNALS

DAG Internals
• Microsoft Exchange Replication service
• Active Manager Client component
• Microsoft DAG Management service
• Cluster service and components
• Windows Crimson Channel

DAG Replication Service
• Introduced in Exchange 2007 RTM
– Microsoft Exchange Replication service | MSExchangeRepl
– MSExchangeRepl.exe
– Required on all Mailbox servers (not just DAG members)
– Communicates with Active Directory and other DAG members

DAG Replication Service
• Includes 16 components
– Active Directory lookup
– Copy status lookup
– Replay core manager
– Seed manager
– Autoreseed manager
– Disk reclaimer manager
– Replay RPC server wrapper
– Remote data provider wrapper
– Active Manager
– Active Manager RPC server wrapper
– Failure item manager
– TPR API manager
– Support API manager
– Server locator manager
– Health state tracker
– VSS Writer

Active Manager Client Component
• Runs inside client access and transport services
– Microsoft Exchange RPC Client Access | MSExchangeRPC
– Microsoft Exchange Transport | MSExchangeTransport
– Microsoft Exchange Frontend Transport | MSExchangeFrontendTransport
– Client Access Front End (CAFÉ) components

Active Manager Client Component
• When connecting clients or routing messages, client access and transport services query Active Directory and Active Manager to find the location of the active copy of a mailbox's database
– DAG members holding the Standby Active Manager (SAM) role respond to these queries

DAG Management Service
• Introduced in Exchange 2013 CU2
– Microsoft Exchange DAG Management service | MSExchangeDagMgmt
– MSExchangeDagMgmt.exe
– Runs on all Mailbox servers (not just DAG members)
– Communicates with Active Directory and other DAG members
• Includes 4 components
– Active Directory lookup
– Copy status lookup
– Monitoring
– Tracer instance

DAG Management Service
• Created for two primary reasons:
– so the Replication service can have more focused functionality
– so Managed Availability actions can kill lower-priority activities
• Logs events in same place as Replication service
• Other functions will move to this service
– AutoReseed, Disk reclaimer, Dynamic replay lag playdown
– Future AutoDAG copy layout and mobility features

Cluster service
• Introduced in Windows NT Server 4.0, Enterprise Edition (1997)
– Cluster Service | ClusSvc
– Clussvc.exe
• Exchange DAGs use several cluster components
– Membership and node management
– Networks and heartbeating
– Quorum

Cluster service
• Quorum is required in order to mount databases
• Quorum is based on votes, not membership
• Voting can be rigged
– Votes can be taken away manually or dynamically
• Exchange manages quorum model, not quorum
– Exchange management of quorum model is based on nodes, not votes
– Removing votes requires manual configuration of quorum model
– Exchange will make incorrect quorum model management decisions if votes are manually removed at the cluster level
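
The difference between votes and membership can be sketched with a simplified strict-majority rule. This is an illustration only, not the cluster service's actual algorithm; the `Voter` class, the `has_quorum` function, and the server names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    """A quorum participant: a DAG member or the witness (illustrative model)."""
    name: str
    has_vote: bool = True   # a vote can be taken away manually or dynamically
    online: bool = True


def has_quorum(voters):
    """True when the online voters hold a strict majority of the assigned votes."""
    assigned = [v for v in voters if v.has_vote]
    online = [v for v in assigned if v.online]
    return len(online) > len(assigned) / 2


# Four DAG members plus a witness: five votes, so three must be online.
members = [Voter(f"MBX{i}") for i in range(1, 5)]
witness = Voter("FSW")
cluster = members + [witness]

members[0].online = False
members[1].online = False
print(has_quorum(cluster))   # 3 of 5 assigned votes are online

members[2].online = False
print(has_quorum(cluster))   # 2 of 5 assigned votes are online

# Taking votes away from two offline members changes the arithmetic:
# 2 of 3 assigned votes are online, although only one member is running.
members[0].has_vote = False
members[1].has_vote = False
print(has_quorum(cluster))
```

The last case shows why the outcome depends on which participants hold votes rather than on how many nodes belong to the cluster.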

Cluster registry
• Active Manager stores database / server information in the cluster registry for DAG members
– Registry changes are replicated immediately to all DAG members
• Stored information is used as part of Best Copy and Server Selection (BCSS)

Cluster registry
• ActiveServer
– Name of server where database is currently mounted or expected to be mounted when mount operation completes
• LastMountServer
– Name of server where database was last successfully mounted
• LastMountedTime
– Date and time stamp of the last time database was mounted

Cluster registry
• MountStatus
– Current mount status for database (mounted / dismounted)
• IsAdminDismounted
– Designates whether current dismounted status is the result of administrator action (true / false)
• IsAutomaticActionsAllowed
– Designates whether the database can be automatically activated (true / false)
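
The six values listed above can be pictured as one record per database. The sketch below is a hypothetical in-memory model, not the real registry layout; in particular, the rule in `eligible_for_automatic_mount` is an assumption about how the two flags might be combined and does not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class MountStatus(Enum):
    MOUNTED = "mounted"
    DISMOUNTED = "dismounted"


@dataclass
class DatabaseState:
    """Per-database values named on the slides (field types are assumptions)."""
    active_server: str
    last_mount_server: str
    last_mounted_time: datetime
    mount_status: MountStatus
    is_admin_dismounted: bool
    is_automatic_actions_allowed: bool


def eligible_for_automatic_mount(state):
    """Assumed rule: skip databases that are mounted, that an administrator
    dismounted, or for which automatic actions are not allowed."""
    return (
        state.mount_status is MountStatus.DISMOUNTED
        and not state.is_admin_dismounted
        and state.is_automatic_actions_allowed
    )


# Hypothetical database that an administrator dismounted on purpose.
db1 = DatabaseState(
    active_server="MBX1",
    last_mount_server="MBX1",
    last_mounted_time=datetime(2014, 2, 12, 9, 30),
    mount_status=MountStatus.DISMOUNTED,
    is_admin_dismounted=True,
    is_automatic_actions_allowed=True,
)
print(eligible_for_automatic_mount(db1))
```

Because `is_admin_dismounted` is set, this record is not treated as a candidate for automatic action under the assumed rule.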

Crimson Channel
• Applications and Services logs
– Area of Windows Server event log used by applications for logging and internal communication
– These logs store events from a single application or component rather than events that might have system-wide impact
– This is referred to as an application's crimson channel
• Exchange 2013 has a crimson channel with multiple areas
– ActiveMonitoring
– HighAvailability
– MailboxDatabaseFailureItems
– ManagedAvailability
– PushNotifications
– Troubleshooters

WITNESS SERVER AND WITNESS SERVER PLACEMENT

Witness Server
• A server that participates in a failover cluster with an even number of members
– Is not a member of the cluster/DAG
– Does not contain a copy of quorum data
• File share on this server is represented by File Share Witness resource in cluster core resource group
– Uses IsAlive check for availability

Witness Server
• File Share Witness Resource Behavior
– If server or share is not available, cluster resources are failed and moved to another node
– If FSW resource does not come back online, it remains in a Failed state, with restart attempts every 60 minutes
– If witness server is needed for quorum, and resource cannot be brought online, quorum will be lost
  • Single restart attempt for FSW resource in Failed state

Witness Server
• When witness server is needed to maintain quorum, one of the nodes locks the witness.log on witness server
– Node that locks witness.log file is called the locking node
– If enough nodes are in contact with the locking node to constitute a majority, quorum is maintained
– Nodes that can’t communicate with locking node lose quorum
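
The locking-node rule can be sketched as a vote count over network partitions. This is a simplified model under stated assumptions: every node has one vote, the witness vote goes to whichever group contains the locking node, and the function name `surviving_nodes` and the server names are hypothetical.

```python
def surviving_nodes(partitions, locking_node):
    """Return the group of nodes that keeps quorum, or an empty set.

    partitions: list of sets, each holding nodes that can still talk to
    each other. The witness vote counts for the group that contains the
    locking node (the node holding the lock on witness.log).
    """
    total_votes = sum(len(group) for group in partitions) + 1  # nodes + witness
    for group in partitions:
        votes = len(group) + (1 if locking_node in group else 0)
        if votes > total_votes / 2:
            return set(group)
    return set()


# A four-member DAG split evenly: the half with the locking node wins.
split = [{"MBX1", "MBX2"}, {"MBX3", "MBX4"}]
print(sorted(surviving_nodes(split, locking_node="MBX1")))
print(sorted(surviving_nodes(split, locking_node="MBX3")))
```

With an even split, neither half has three of the five votes on its own, so the witness lock decides which half stays up.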

Witness Server
• Attempts to lock witness.log file occur in a specific order
– Node that owns cluster core resource group tries immediately
– Nodes not owning cluster core resources wait 6 seconds before trying
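
The ordering above can be sketched as a small schedule. The 6-second delay comes from the slide; everything else is an assumption made for illustration, including the function names, the `reachable` set (nodes that can currently reach the witness share), and the alphabetical tie-break among challenging nodes.

```python
CHALLENGE_DELAY_SECONDS = 6  # wait applied to nodes that do not own the core resources


def lock_attempt_schedule(nodes, core_owner):
    """Return (delay_in_seconds, node) pairs sorted by when each node tries the lock."""
    schedule = []
    for node in nodes:
        delay = 0 if node == core_owner else CHALLENGE_DELAY_SECONDS
        schedule.append((delay, node))
    # Sorting tuples orders by delay first; the name tie-break is arbitrary.
    return sorted(schedule)


def first_locker(nodes, core_owner, reachable):
    """Return (node, delay) for the first node able to take the lock, or None."""
    for delay, node in lock_attempt_schedule(nodes, core_owner):
        if node in reachable:
            return node, delay
    return None


nodes = ["MBX1", "MBX2", "MBX3", "MBX4"]

# Owner of the cluster core resources is healthy: it locks immediately.
print(first_locker(nodes, core_owner="MBX2", reachable=set(nodes)))

# Owner is unavailable: a challenging node succeeds after the delay.
print(first_locker(nodes, core_owner="MBX2", reachable={"MBX1", "MBX3", "MBX4"}))
```

The head start gives the current owner of the cluster core resources the first chance at the lock, so a challenger only succeeds when the owner cannot take it.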

Witness Server
[Diagram: node owning cluster core resources, challenging node, and witness.log lock, each with a sequence number (values 20 to 22), on a timeline marked 0 to 16]
• Cluster state change – node owning cluster core resources locks FSW and updates sequence number
• Challenging node attempts witness lock. Lock already exists with a higher sequence number, so the challenge is not successful
• All nodes available. FSW lock released. Changes replicated, sequence numbers in sync

Witness Server
[Diagram: node owning cluster core resources, challenging node, and witness.log lock, each with a sequence number (values 20 to 22), on a timeline marked 0 to 16]
• Cluster state change – node owning cluster core resources unavailable
• Challenging node attempts witness lock. No lock exists, lock successful, sequence number updated
• All nodes available. FSW lock released. Changes replicated, sequence numbers in sync

Witness Server Placement
• Exchange 2010 guidance
– “We recommend that you use a Hub Transport server running on Exchange Server 2010 in the Active Directory site containing the DAG. This allows the witness server/directory to remain under the control and visibility of an Exchange administrator.”
– “If your DAG is extended to multiple datacenters, we recommend deploying the witness server in the datacenter considered to be the primary datacenter.”

Witness Server Placement
• Exchange 2013 guidance more complicated due to options introduced by architectural changes
• Exchange 2013 includes support for new DAG configuration options
– A third location, such as a third physical datacenter or branch office
• Ultimately, the placement of a DAG’s witness server depends on your business requirements and the options available to you

Witness Server Placement
Deployment scenario | Placement recommendation
Single DAG deployed in one datacenter | Locate witness server in the same datacenter as DAG members
Single DAG deployed across two datacenters; no additional locations available | Locate witness server in primary datacenter
Multiple DAGs deployed in one datacenter | Locate witness server in the same datacenter as DAG members. Additional options include: using the same witness server for multiple DAGs; using a DAG member to act as a witness server for a different DAG
Multiple DAGs deployed across two datacenters | Locate witness server in the same datacenter as DAG members. Additional options include: using the same witness server for multiple DAGs; using a DAG member to act as a witness server for a different DAG
Single or multiple DAGs deployed across more than two datacenters | Locate the witness server in the datacenter where you want the majority of quorum votes to exist

Witness Server Placement
• If your organization has a 3rd location, a witness server can be deployed there for automatic database failover between two other sites
– The witness server location must have network infrastructure and connectivity that is isolated from network failures that affect the two datacenters with DAG members
• For all DAGs, the availability of the witness server should be on the Exchange

Witness Server Placement
• IaaS providers and cloud providers are not supported for use as a witness server
– This includes Windows Azure, which does not yet support the required underlying network configuration to allow an Azure file server VM to act as a witness server in a multi-datacenter deployment
– More info at http://aka.ms/DAGAzure

Related Content
• Exchange 2013 High Availability and Site Resilience (Session 2/2, part one) – 12/02/14 – 16:30-17:15, room 253
• Exchange 2013 Sizing and Performance – 12/02/14 – 17:45-18:30, room 252B

QUESTIONS?
Thank You!

Exchange 2013 High Availability and Site Resilience (Session 1/2, part one)
