Objectives
Understand database replication
Manage a database availability group
Understand Active Manager
Understand site resiliency for Exchange 2013
Overview of Critical Services
Messaging services have the following requirements
◦ Business-critical
◦ Always online with minimal to no data loss
◦ Capable of surviving a variety of failure scenarios
◦ Flexible
◦ Managed during business hours with minimal impact on end users
◦ Fast and secure
Database Availability Groups (DAGs)
Method of providing database resiliency by replicating the active Mailbox database to other
Mailbox servers
DAG configuration includes
◦ Add mailbox servers to the DAG as members
◦ Decide which databases will be replicated to which members
◦ One Mailbox server will have an “active” copy of the database while others store a “passive” copy
◦ If the active database fails, a passive copy becomes active
◦ Minimal to no interruption of Mailbox service to end users
◦ Business and technical requirements may drive stretching the DAG across multiple datacenters
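The configuration steps above might look roughly like this in the Exchange Management Shell. This is a minimal sketch, not a complete build procedure: the DAG name DAG01 matches the name used later in this deck, while the server names MBX1 and MBX2, the witness server FS01, the witness directory and the database name DB1 are assumed for illustration.

```powershell
# Assumed names: DAG01, FS01, MBX1, MBX2 and DB1 are placeholders for illustration.

# Create the DAG and name a witness server that is not a DAG member.
New-DatabaseAvailabilityGroup -Name DAG01 -WitnessServer FS01 -WitnessDirectory C:\DAG01

# Add Mailbox servers to the DAG as members.
Add-DatabaseAvailabilityGroupServer -Identity DAG01 -MailboxServer MBX1
Add-DatabaseAvailabilityGroupServer -Identity DAG01 -MailboxServer MBX2

# Decide where a database is replicated: add a passive copy of DB1 on MBX2.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX2 -ActivationPreference 2
```

The server that already hosts DB1 keeps the active copy; the copy added on MBX2 is passive until a failover or switchover.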
Database Replication
DAG is the boundary to replicate database content between Mailbox servers
Uses the Microsoft Exchange Replication Service to continuously replicate transaction logs
between the active and passive copies
TCP port 64327 used for replication
Mailbox databases write transactions to memory (log buffer) then to disk (transaction log) and
then commit the transactions to the Mailbox database
Uses file mode and block mode replication
◦ File mode replicates closed transaction log files (1 MB each) between servers
◦ Once file mode is up to date, block mode begins, which replicates the log buffer to passive DAG
members. This minimizes losses in the event of failure.
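One way to observe continuous replication is to look at the copy and replay queues for each database copy. The sketch below assumes a DAG member named MBX2 (a placeholder) and only reads status; it changes nothing.

```powershell
# Assumed name: MBX2 is a placeholder DAG member.

# Show each database copy on the server with its replication queues.
Get-MailboxDatabaseCopyStatus -Server MBX2 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# Run the built-in replication health checks against the same server.
Test-ReplicationHealth -Identity MBX2
```

A copy queue length at or near zero suggests the passive copy is keeping up with log shipping from the active copy.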
DAG Requirements
Requires the clustering components included with Windows Server 2008 R2 Enterprise or Datacenter and Windows Server
2012 Standard and Datacenter
All DAG members must be running the same OS version
Supported by both Exchange 2013 editions, Standard and Enterprise
Maximum of 16 members in a single DAG
All members must belong to the same domain
Installing the Exchange Mailbox server role on a domain controller is not supported by Microsoft for DAG membership
DAG name is limited to 15 characters
All DAG members should use the same number of NICs and have connectivity to all members
If using multiple NICs they must be on different networks
Round-trip latency limit of 500 milliseconds between members
Requires a non-DAG member to be used as a file share witness
DAG Quorum
DAG availability is based on quorum, which is maintained when a majority of the Mailbox
servers are online
Formula N/2+1 (rounded down) determines the number of servers required to maintain quorum
◦ 7 Mailbox server DAG: 7/2+1 (round down) = 4 DAG members must be online to maintain quorum
Traditionally, once quorum is lost the cluster is taken offline and its mailbox databases
are dismounted
Windows Server 2012 uses “dynamic quorum” which adjusts the number of servers required to
maintain quorum after a failure making it more resilient
Quorum model depends on the number of DAG members
◦ Even number: A Node and File Share Majority (uses a file share witness server)
◦ Odd number: A Node Majority
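The quorum arithmetic on this slide can be sketched as a small helper. Get-QuorumInfo is a hypothetical function written for illustration, not an Exchange or Windows cmdlet, and it ignores dynamic quorum. It assumes that an even-sized DAG counts the file share witness as one extra voter.

```powershell
# Hypothetical helper for illustration; not part of Exchange or Windows.
function Get-QuorumInfo {
    param(
        [Parameter(Mandatory = $true)]
        [ValidateRange(1, 16)]      # a DAG holds at most 16 members
        [int]$MemberCount
    )

    $isEven = ($MemberCount % 2) -eq 0

    # An even number of members adds the file share witness as a voter.
    if ($isEven) {
        $model  = 'Node and File Share Majority'
        $voters = $MemberCount + 1
    } else {
        $model  = 'Node Majority'
        $voters = $MemberCount
    }

    [pscustomobject]@{
        Members       = $MemberCount
        QuorumModel   = $model
        Voters        = $voters
        # N/2+1, rounded down, applied to the number of voters.
        VotesRequired = [math]::Floor($voters / 2) + 1
    }
}

Get-QuorumInfo -MemberCount 7   # VotesRequired is 4
Get-QuorumInfo -MemberCount 8   # VotesRequired is 5
```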
File Share Witness
A witness server is a domain-joined computer that is not a member of the DAG and is used to
maintain quorum when the DAG contains an even number of Mailbox servers
Example:
◦ Two datacenters with 4 Mailbox servers each (8 in total) lose the connection between them
◦ Formula: N/2+1 = 8/2+1 = 5 servers required to maintain quorum
◦ Without a witness server, once the connection is broken neither datacenter continues to function, since
each has fewer than the number required for quorum
The site with the witness server gets an additional “vote” and therefore maintains quorum
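The split-site example can be checked with a few lines of arithmetic. Test-SiteQuorum is a hypothetical helper for illustration only; it assumes an even-sized DAG with one file share witness and does not model dynamic quorum.

```powershell
# Hypothetical helper: does one side of a network partition keep quorum?
function Test-SiteQuorum {
    param(
        [int]$TotalMembers,       # Mailbox servers in the DAG
        [int]$ReachableMembers,   # members this site can still reach
        [bool]$HasWitness         # this site can reach the file share witness
    )

    # All members plus the witness can vote.
    $totalVotes = $TotalMembers + 1
    $required   = [math]::Floor($totalVotes / 2) + 1

    # Votes held by this side of the partition.
    $votes = $ReachableMembers + [int]$HasWitness
    $votes -ge $required
}

# 8-member DAG split 4/4 across two datacenters:
Test-SiteQuorum -TotalMembers 8 -ReachableMembers 4 -HasWitness $true    # True
Test-SiteQuorum -TotalMembers 8 -ReachableMembers 4 -HasWitness $false   # False
```

The site that can reach the witness holds 5 of 9 votes and stays online; the other site holds 4 and loses quorum.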
Symmetric Database Copies
Microsoft recommendations for using a single volume
◦ Use a single volume for the entire disk
◦ Number of copies of each database should equal the number
of database copies per disk
◦ Activation preference should be balanced across DAG members
Note that Windows Server Backup targets the entire
volume
DAG Member Network Interfaces
Microsoft recommends using multiple interfaces on DAG members
Maximum of one NIC used for client connections (MAPI NIC)
◦ Uses a default gateway
◦ Enable File and Printer Sharing for Microsoft Networks
◦ Enable Client for Microsoft Networks
◦ Register in DNS
◦ Highest priority in binding order
One or more NICs for DAG communication
◦ Separate subnet from MAPI NIC
◦ No default gateway
◦ Disable File and Printer Sharing for Microsoft Networks
◦ Disable Client for Microsoft Networks
◦ Do not register in DNS
◦ Do not use NICs configured for iSCSI
Lagged Mailbox Database Copies
A lagged mailbox database copy is a passive copy of an active database that delays
replaying transaction logs into its database for a predetermined period of time (replay lag time)
◦ Replay lag time specifies how long to wait before a transaction log is replayed into the mailbox
database copy
◦ Truncation lag time specifies how long to wait before a replayed transaction log file is deleted from disk (the timer starts after
the log has been replayed)
Note:
◦ The duration set by the ReplayLagTime parameter should match the length of time Safety Net stores
email messages (default is 2 days)
◦ The Safety Net is a message queue retention mechanism used to store a copy of each message
delivered to the active copy of a mailbox database
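A lagged copy could be configured along these lines. This is a sketch under assumptions: the database DB1 and server MBX3 are placeholder names, and the 2-day values are chosen only to mirror the Safety Net default mentioned above.

```powershell
# Assumed names: DB1 and MBX3 are placeholders for illustration.

# Add a passive copy that waits 2 days before replaying logs.
Add-MailboxDatabaseCopy -Identity DB1 -MailboxServer MBX3 `
    -ReplayLagTime 2.00:00:00 -TruncationLagTime 0.00:00:00 -ActivationPreference 3

# Keep Safety Net retention in line with the replay lag time.
Set-TransportConfig -SafetyNetHoldTime 2.00:00:00

# Review the lag settings recorded on the database.
Get-MailboxDatabase DB1 | Format-List Name, ReplayLagTimes, TruncationLagTimes
```

The time values use the d.hh:mm:ss format, so 2.00:00:00 means two days.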
Automatic Reseed
New to Exchange 2013
Involves pre-mapping volumes and using mount
points to plan an automated reseeding of a failed
database replica back to a healthy state
Active Manager
Active Manager runs on all Mailbox servers inside the Microsoft Exchange Replication Service
and is responsible for managing the active database copies in a DAG during failover
When the Mailbox server is in a DAG there are two Active Manager roles
◦ Primary Active Manager (PAM)
◦ Held by one DAG member
◦ Has ownership of the cluster quorum resources
◦ Standby Active Manager (SAM)
◦ Available to become primary should the current role holder fail
◦ Notifies the PAM of active local databases
◦ If an active database fails the SAM notifies the PAM to begin a failover to a passive database copy
◦ If the entire server fails, the PAM is already aware of the active databases that were held by that Mailbox server and will begin
failover to a passive copy of the database
Best Copy and Server Selection (BCSS)
Process used by the PAM Mailbox server for automatic selection of
a new active database during failover/switchover when the
Administrator has not specified a target server
Uses 10 sets of criteria when determining a new active database
If the selected copy passes the 10 sets of criteria and the number of lost transaction logs
is within the acceptable limit, it will become the new active database
The AutoDatabaseMountDial parameter of the Set-MailboxServer cmdlet defines the acceptable
number of missing transaction logs
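The mount dial is set per Mailbox server. In the sketch below, MBX1 is a placeholder server name. The setting accepts Lossless, GoodAvailability and BestAvailability, which as far as I know allow 0, 6 and 12 missing logs respectively.

```powershell
# Assumed name: MBX1 is a placeholder Mailbox server.

# Allow automatic mounting only when few logs are missing.
Set-MailboxServer -Identity MBX1 -AutoDatabaseMountDial GoodAvailability

# Confirm the current setting.
Get-MailboxServer -Identity MBX1 | Format-List Name, AutoDatabaseMountDial
```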
Best Copy and Server Selection (BCSS)
The order used to select a new active database is as follows:
◦ Meets 10 sets of criteria
◦ Copy Queue Length: number of log files waiting to be copied and inspected
◦ Activation Preference: Administrative preference number
◦ Replay Queue Length: number of log files waiting to be replayed into this copy of the database
• MBX4DB1 becomes the Active database
– Meets the first set of criteria (as do the others)
– Lowest Copy Queue Length
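The ordering on this slide can be illustrated with a simple sort. This is a simplified sketch, not the Active Manager algorithm itself: it assumes every candidate has already passed the criteria checks. Only the name MBX4DB1 comes from the slide; the other copy names and all queue values are invented for illustration.

```powershell
# Hypothetical copy status values; only the name MBX4DB1 comes from the slide.
$copies = @(
    [pscustomobject]@{ Name = 'MBX2DB1'; CopyQueueLength = 4; ActivationPreference = 2; ReplayQueueLength = 0 }
    [pscustomobject]@{ Name = 'MBX3DB1'; CopyQueueLength = 2; ActivationPreference = 3; ReplayQueueLength = 5 }
    [pscustomobject]@{ Name = 'MBX4DB1'; CopyQueueLength = 0; ActivationPreference = 4; ReplayQueueLength = 3 }
)

# Sort by copy queue length, then activation preference, then replay queue length.
$best = $copies |
    Sort-Object CopyQueueLength, ActivationPreference, ReplayQueueLength |
    Select-Object -First 1

$best.Name   # MBX4DB1
```

Here MBX4DB1 is chosen despite having the least-preferred activation number, because copy queue length is compared first.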
Site Resiliency
Managing multi-site Exchange organizations has been simplified in Exchange 2013: clients no longer
connect to site-specific namespaces or via RPC, but use HTTP and can connect to any CAS
server
◦ Exchange 2010
◦ Clients connected to a CAS namespace for a database, which effectively made the CAS a point of failure. If a DAG failed over to
another datacenter, the admin had to update the RpcClientAccessServer parameter to point to the CAS servers in the new location
Use multiple DNS A records resolving the same name to the IP address of multiple CAS servers in
different sites. If a local CAS server doesn’t respond a remote CAS can proxy the request back to
a local DAG
Page Patching
When the DAG spans multiple sites there is a greater delay in replaying transaction logs to the
passive database
This can result in divergence between datacenters in the event of failure
Replication service will attempt to update the transaction logs with information from the active
database
If the passive database becomes active without being fully updated there will be a loss in
content contained within the remaining transaction logs
In this event, the Administrator must determine which log files are missing and use a recovery
mailbox database to restore the missing log files and export the content into a PST file
Site Resiliency Scenarios
SINGLE DAG – TWO SITES
Primary datacenter contains the majority of
Mailbox servers
If the number of Mailbox servers is even, a File
Share Witness is placed at the primary
datacenter
Issues
◦ If end users are located in both sites, failure of
the WAN link will result in loss of email service
for users in the secondary datacenter
◦ Requires manual failover to the secondary
datacenter should the primary fail
Site Resiliency Scenarios
SINGLE DAG – THREE SITES
Microsoft preferred solution
Uses an even number of DAG members across
both sites with the File Share Witness located
in the third datacenter
Benefit is that either datacenter can fail and
quorum is maintained keeping the DAG active
Issues
◦ Failure of the WAN link from the secondary
datacenter will prevent end users at that
location from accessing email services
Site Resiliency Scenarios
MULTIPLE DAGS – TWO SITES
Used when the WAN connection doesn’t
support the throughput required for
continuous replication
Allows mailbox services to be available at both
locations in the event of a WAN link failure by
having local users associated with a local
active DAG
Issues
◦ Requires more servers, storage and support
Patch Management
Cumulative updates are released every 3 months (quarterly).
A cumulative update is a full product build. It is possible to install Exchange 2013 from scratch using a
cumulative update download, as well as to upgrade a previous release to the latest software
level.
Cumulative updates may include schema updates, so they must be planned for
the entire forest, not just a single server, and require Enterprise Admin and Schema Admin
privileges.
Maintenance Mode
Used when installing a cumulative update on a server within a DAG. Placing a DAG member in maintenance mode
moves all active databases off the server and blocks any other server from moving a database to it.
The DAG member is put into Maintenance Mode by using the following commands in EMS:
◦ CD $ExScripts
.\StartDAGServerMaintenance.ps1 -Server AMS-EXCH01
When the DAG member is upgraded (and rebooted), it can be put back into normal operation using the following
commands in EMS:
◦ CD $ExScripts
.\StopDAGServerMaintenance.ps1 -Server AMS-EXCH01
The last step is to redistribute the mailbox databases across all the DAG members. Again, the
RedistributeActiveDatabases.ps1 script can be found in the $ExScripts directory so you can use the following command in
EMS. This redistributes the active mailbox databases across the DAG based on their activation preference.
◦ CD $ExScripts
.\RedistributeActiveDatabases.ps1 -DagName DAG01 -BalanceDbsByActivationPreference -Confirm:$False
References
Sybex, Mastering Microsoft Exchange 2013 by David Elfassy

Ch05 high availability

  • 2. Objectives
    Understand database replication
    Manage a database availability group
    Understand Active Manager
    Understand site resiliency for Exchange 2013
  • 3. Overview of Critical Services
    Messaging services have the following requirements:
    ◦ Business-critical
    ◦ Always online with minimal to no data loss
    ◦ Capable of surviving a variety of failure scenarios
    ◦ Flexible
    ◦ Managed during business hours with minimal impact on end users
    ◦ Fast and secure
  • 4. Database Availability Groups (DAGs)
    A method of providing database resiliency by replicating the active Mailbox database to other Mailbox servers
    DAG configuration includes:
    ◦ Add Mailbox servers to the DAG as members
    ◦ Decide which databases will be replicated to which members
    ◦ One Mailbox server holds the "active" copy of a database while the others store "passive" copies
    ◦ If the active database fails, a passive copy becomes active
    ◦ Minimal to no interruption of Mailbox service to end users
    ◦ Business and technical requirements can drive the DAG to stretch across multiple datacenters
  • 5. Database Replication
    The DAG is the boundary for replicating database content between Mailbox servers
    Uses the Microsoft Exchange Replication Service to continuously replicate transaction logs between the active and passive copies
    TCP port 64327 is used for replication
    Mailbox databases write transactions to memory (log buffer), then to disk (transaction log), and then commit the transactions to the Mailbox database
    Uses file mode and block mode replication
    ◦ File mode replicates the transaction logs (1 MB each) between servers
    ◦ Once file mode is up to date, block mode begins, which replicates the log buffer to passive DAG members. This minimizes losses in the event of failure.
  • 6. DAG Requirements
    Requires the clustering components included with Windows Server 2008 R2 Enterprise or Datacenter and Windows Server 2012 Standard and Datacenter
    All DAG members must be running the same OS version
    Supported by both Exchange 2013 editions, Standard and Enterprise
    Maximum of 16 members in a single DAG
    Members must belong to the same domain
    An Exchange Mailbox server installed on a domain controller is not supported by Microsoft for DAG membership
    DAG name is limited to 15 characters
    All DAG members should use the same number of NICs and have connectivity to all members
    If using multiple NICs, they must be on different networks
    Round-trip latency limit of 500 milliseconds between members
    Requires a non-DAG member to be used as a file share witness
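The numeric and membership limits listed on this slide lend themselves to a pre-flight check. The following is a minimal Python sketch under stated assumptions: it is not an Exchange tool, and the `Member` type, the `check_dag_plan` function, and the server and domain names in the example are all hypothetical. It only encodes the limits stated above (16 members, 15-character name, 500 ms round trip, same OS, same domain, no domain controllers, witness outside the DAG).

```python
from dataclasses import dataclass
from typing import List

MAX_MEMBERS = 16          # maximum members in a single DAG
MAX_NAME_LENGTH = 15      # DAG name character limit
MAX_ROUND_TRIP_MS = 500   # round-trip latency limit between members


@dataclass
class Member:
    """A planned DAG member (hypothetical planning record, not an Exchange object)."""
    name: str
    os_version: str
    domain: str
    is_domain_controller: bool = False


def check_dag_plan(dag_name: str, members: List[Member],
                   worst_round_trip_ms: float, witness: str) -> List[str]:
    """Return a list of requirement violations; an empty list means none were found."""
    problems = []
    if len(dag_name) > MAX_NAME_LENGTH:
        problems.append(f"DAG name exceeds {MAX_NAME_LENGTH} characters")
    if len(members) > MAX_MEMBERS:
        problems.append(f"more than {MAX_MEMBERS} members")
    if len({m.os_version for m in members}) > 1:
        problems.append("members run different OS versions")
    if len({m.domain for m in members}) > 1:
        problems.append("members are in different domains")
    for m in members:
        if m.is_domain_controller:
            problems.append(f"{m.name} is a domain controller")
    if worst_round_trip_ms > MAX_ROUND_TRIP_MS:
        problems.append(f"round-trip latency exceeds {MAX_ROUND_TRIP_MS} ms")
    if witness in {m.name for m in members}:
        problems.append("file share witness is a DAG member")
    return problems


# Example with made-up servers: the mixed OS versions are reported.
plan = [
    Member("MBX1", "Windows Server 2012", "example.local"),
    Member("MBX2", "Windows Server 2008 R2", "example.local"),
]
for problem in check_dag_plan("DAG01", plan, 120.0, "FS01"):
    print(problem)
```

The check returns every violation rather than stopping at the first one, so a plan can be corrected in a single pass.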
  • 7. DAG Quorum
    DAG availability is based on quorum, which is maintained when a majority of the Mailbox servers are online
    The formula N/2+1 (with N/2 rounded down) gives the number of servers required to maintain quorum
    ◦ 7-member DAG: 7/2+1 = 4 DAG members must be online to maintain quorum
    Traditionally, once quorum is lost the cluster is marked as offline and the mailbox databases are dismounted
    Windows Server 2012 uses "dynamic quorum", which adjusts the number of servers required to maintain quorum after a failure, making the cluster more resilient
    The quorum model depends on the number of nodes
    ◦ Even number: Node and File Share Majority (uses a file share witness server)
    ◦ Odd number: Node Majority
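The quorum arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration of the N/2+1 (rounded down) formula and the even/odd rule stated above, not cluster code; the function names `votes_required` and `quorum_model` are hypothetical, and dynamic quorum is not modelled.

```python
def votes_required(members: int) -> int:
    """Servers needed for a majority: N/2 + 1, with N/2 rounded down."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    return members // 2 + 1


def quorum_model(members: int) -> str:
    """Quorum model the slide names for an even or odd member count."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    if members % 2 == 0:
        return "Node and File Share Majority"
    return "Node Majority"


for n in (2, 7, 8, 16):
    print(n, votes_required(n), quorum_model(n))
```

For the 7-member DAG from the slide, `votes_required(7)` evaluates to 4, matching the worked example.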
  • 8. File Share Witness
    A witness server is a domain-joined computer that is not part of a DAG and that can be used to maintain quorum when a DAG contains an even number of Mailbox servers
    Example:
    ◦ Two datacenters with 4 servers in each site (8 Mailbox servers) lose the connection between them
    ◦ Formula: N/2+1, so 8/2+1 = 5 servers are required to maintain quorum
    ◦ Without a witness server, once the connection is broken neither datacenter continues to function, since each has fewer servers than quorum requires
    The site with the witness server gets an additional "vote" and therefore maintains quorum
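The two-datacenter example can be worked through as a small vote count. This is a simplified Python model, assuming one vote per Mailbox server, one extra vote for the site hosting the witness, and a witness that only counts when the member total is even. The function name `sites_with_quorum` and the site labels are hypothetical.

```python
from typing import Dict, List, Optional


def sites_with_quorum(site_members: Dict[str, int],
                      witness_site: Optional[str]) -> List[str]:
    """Sites that still hold quorum after all links between sites fail.

    Each Mailbox server has one vote. The witness adds one vote to the
    site that can reach it, but only when the member count is even.
    """
    total = sum(site_members.values())
    needed = total // 2 + 1                 # N/2 + 1, N/2 rounded down
    witness_in_use = total % 2 == 0
    survivors = []
    for site, members in site_members.items():
        votes = members
        if witness_in_use and site == witness_site:
            votes += 1
        if votes >= needed:
            survivors.append(site)
    return survivors


split = {"DC1": 4, "DC2": 4}
print(sites_with_quorum(split, None))    # no witness: neither site reaches 5 votes
print(sites_with_quorum(split, "DC1"))   # witness in DC1: DC1 reaches 5 votes
```

Placing the witness in a third location, as in the three-site scenario later in the deck, is equivalent here to neither site owning the extra vote during a full partition; the sketch does not model partial link failures.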
  • 9. Symmetric Database Copies
    Microsoft recommendations for using a single volume:
    ◦ Use a single volume for the entire disk
    ◦ Number of copies of each database should equal the number of copies per disk
    ◦ Activation preference should be balanced across DAG members
    Note that Windows Server Backup targets the entire volume
  • 10. DAG Member Network Interfaces
    Microsoft recommends using multiple interfaces on DAG members
    Maximum of one NIC used for client connections (MAPI NIC)
    ◦ Uses a default gateway
    ◦ Enable File and Printer Sharing for Microsoft Networks
    ◦ Enable Client for Microsoft Networks
    ◦ Register in DNS
    ◦ Highest priority in binding order
    One or more NICs for DAG communication
    ◦ Separate subnet from MAPI NIC
    ◦ No default gateway
    ◦ Disable File and Printer Sharing for Microsoft Networks
    ◦ Disable Client for Microsoft Networks
    ◦ Do not register in DNS
    ◦ Do not use NICs configured for iSCSI
  • 11. Lagged Mailbox Database Copies
    A lagged mailbox database copy is a replication partner of an active database that delays committing transactions to the database for a predetermined period of time (replay lag time)
    ◦ Replay lag time specifies how long to wait before a transaction log is committed to the mailbox database
    ◦ Truncation lag time defines when the transaction log file will be deleted from disk (begins after the replay lag time has completed)
    Note:
    ◦ The duration of the ReplayLagTime parameter should match the length of time Safety Net stores email messages (default is 2 days)
    ◦ Safety Net is a message queue retention mechanism that stores a copy of each message delivered to the active copy of a mailbox database
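How the two lag settings stack can be shown with plain date arithmetic. This is a minimal Python sketch assuming, as the slide states, that the truncation lag starts once the replay lag has elapsed. The function name `lagged_copy_timeline` and the sample dates are hypothetical, and real truncation also depends on conditions this sketch ignores.

```python
from datetime import datetime, timedelta
from typing import Tuple


def lagged_copy_timeline(log_copied_at: datetime,
                         replay_lag: timedelta,
                         truncation_lag: timedelta) -> Tuple[datetime, datetime]:
    """Earliest replay time and earliest truncation time for one log file."""
    replay_at = log_copied_at + replay_lag        # log is held back for the replay lag
    truncate_at = replay_at + truncation_lag      # truncation lag starts after replay
    return replay_at, truncate_at


copied = datetime(2024, 1, 1, 12, 0)
replay_at, truncate_at = lagged_copy_timeline(
    copied,
    replay_lag=timedelta(days=2),       # matches the 2-day Safety Net default
    truncation_lag=timedelta(days=1),   # illustrative value only
)
print("replay no earlier than:  ", replay_at)
print("truncate no earlier than:", truncate_at)
```

With a 2-day replay lag the copy stays up to 2 days behind the active database, which is why the slide pairs it with the 2-day Safety Net default.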
  • 12. Automatic Reseed
    New to Exchange 2013
    Involves pre-mapping volumes and using mount points to plan an automated reseeding of a failed database replica back to a healthy state
  • 13. Active Manager
    Active Manager runs on all Mailbox servers inside the Microsoft Exchange Replication Service and is responsible for managing the active database copies in a DAG during failover
    When the Mailbox server is in a DAG there are two Active Manager roles
    ◦ Primary Active Manager (PAM)
        ◦ Held by one DAG member
        ◦ Has ownership of the cluster quorum resources
    ◦ Standby Active Manager (SAM)
        ◦ Available to become primary should the current role holder fail
        ◦ Notifies the PAM of active local databases
        ◦ If an active database fails, the SAM notifies the PAM to begin a failover to a passive database copy
    ◦ If the entire server fails, the PAM is already aware of the active databases that were held by that Mailbox server and will begin failover to a passive copy of each database
  • 14. Best Copy and Server Selection (BCSS)
    Process used by the PAM Mailbox server to automatically select a new active database during failover/switchover when the Administrator has not specified a target server
    Uses 10 sets of criteria when determining a new active database
    If the selected copy passes the 10 sets of criteria and the number of logs that would be lost is within the acceptable limit, it becomes the new active database
    The AutoDatabaseMountDial parameter (set with Set-MailboxServer) defines the acceptable number of missing transaction logs
  • 15. Best Copy and Server Selection (BCSS)
    The order used to select a new active database is as follows:
    ◦ Meets 10 sets of criteria
    ◦ Copy Queue Length: number of log files waiting to be copied and inspected
    ◦ Activation Preference: Administrative preference number
    ◦ Replay Queue Length: number of log files waiting to be replayed into this copy of the database
    Example: MBX4DB1 becomes the active database
    ◦ Meets the first set of criteria (as do the others)
    ◦ Lowest Copy Queue Length
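The selection order on this slide can be approximated as a filter followed by a multi-key sort. This is a simplified Python sketch, not Active Manager's actual algorithm: the 10 sets of criteria are collapsed into one boolean, the `CopyStatus` type and `pick_best_copy` function are hypothetical, and the queue lengths in the example are invented so that MBX4DB1 wins on Copy Queue Length as in the slide's example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CopyStatus:
    """State of one passive copy (hypothetical record for illustration)."""
    name: str
    meets_criteria: bool         # stands in for the 10 sets of criteria
    copy_queue_length: int       # logs waiting to be copied and inspected
    activation_preference: int   # lower number = preferred by the administrator
    replay_queue_length: int     # logs waiting to be replayed


def pick_best_copy(copies: List[CopyStatus]) -> Optional[CopyStatus]:
    """Pick the copy to activate using the slide's ordering, or None if none qualify."""
    eligible = [c for c in copies if c.meets_criteria]
    if not eligible:
        return None
    # Lower is better for every key; ties fall through to the next key.
    return min(
        eligible,
        key=lambda c: (c.copy_queue_length,
                       c.activation_preference,
                       c.replay_queue_length),
    )


candidates = [
    CopyStatus("MBX2DB1", True, copy_queue_length=4, activation_preference=2, replay_queue_length=0),
    CopyStatus("MBX3DB1", True, copy_queue_length=2, activation_preference=3, replay_queue_length=1),
    CopyStatus("MBX4DB1", True, copy_queue_length=1, activation_preference=4, replay_queue_length=5),
]
best = pick_best_copy(candidates)
print(best.name if best else "no copy can be activated")
```

Activation preference only matters here when copy queue lengths tie, which is why the copy with the worst preference number can still be selected.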
  • 16. Site Resiliency
    Managing multi-site Exchange organizations has been simplified in Exchange 2013: clients no longer connect to specific namespaces or connect via RPC; instead they use HTTP and can connect to any CAS server
    ◦ Exchange 2010
        ◦ Clients connected to a CAS namespace for a database, which effectively made the CAS a point of failure. If a DAG failed over to another datacenter, the admin had to update the RpcClientAccessServer parameter to point to the CAS servers of the new location
    Use multiple DNS A records that resolve the same name to the IP addresses of multiple CAS servers in different sites. If a local CAS server doesn't respond, a remote CAS can proxy the request back to a local DAG
  • 17. Page Patching
    When the DAG spans multiple sites there is a greater delay in replaying transaction logs to the passive database
    This can result in divergence between datacenters in the event of failure
    The Replication service will attempt to update the transaction logs with information from the active database
    If the passive database becomes active without being fully updated, the content contained in the remaining transaction logs is lost
    In this event, the Administrator must determine which log files are missing and use a recovery mailbox database to restore the missing log files and export the content into a PST file
  • 18. Site Resiliency Scenarios
    SINGLE DAG – TWO SITES
    Primary datacenter contains the majority of Mailbox servers
    If the number of Mailbox servers is even, a File Share Witness is placed at the primary datacenter
    Issues
    ◦ If end users are located in both sites, failure of the WAN link will result in loss of email service for users in the secondary datacenter
    ◦ Requires manual failover to the secondary datacenter should the primary fail
  • 19. Site Resiliency Scenarios
    SINGLE DAG – THREE SITES
    Microsoft preferred solution
    Uses an even number of DAG members across two sites, with the File Share Witness located in a third datacenter
    Benefit is that either datacenter can fail and quorum is maintained, keeping the DAG active
    Issues
    ◦ Failure of the WAN from the secondary datacenter will prevent end users at that location from accessing email services
  • 20. Site Resiliency Scenarios
    MULTIPLE DAGS – TWO SITES
    Used when the WAN connection doesn't support the throughput required for continuous replication
    Allows mailbox services to be available at both locations in the event of a WAN link failure by having local users associated with a local active DAG
    Issues
    ◦ Requires more servers, storage and support
  • 21. Patch Management
    Cumulative updates are released every 3 months (quarterly).
    A cumulative update is a full product build. It can be used to install Exchange 2013 from scratch as well as to upgrade a previous release to the latest software level.
    Cumulative updates may include schema updates. They therefore need to be planned with the entire forest in mind, not just a single server, and they require Enterprise Admin and Schema Admin privileges.
  • 22. Maintenance Mode
    Used when installing a cumulative update on a server within a DAG. Placing a DAG member in maintenance mode moves all active databases off the server and blocks other servers from moving a database to it.
    The DAG member is put into Maintenance Mode by using the following commands in EMS:
    ◦ CD $ExScripts
      .\StartDAGServerMaintenance.ps1 -Server AMS-EXCH01
    When the DAG member has been upgraded (and rebooted), it can be put back into normal operation using the following commands in EMS:
    ◦ CD $ExScripts
      .\StopDAGServerMaintenance.ps1 -Server AMS-EXCH01
    The last step is to redistribute the active mailbox databases across all the DAG members based on their activation preference. The RedistributeActiveDatabases.ps1 script is also in the $ExScripts directory, so you can use the following commands in EMS:
    ◦ CD $ExScripts
      .\RedistributeActiveDatabases.ps1 -DagName DAG01 -BalanceDbsByActivationPreference -Confirm:$False
  • 23. References
    David Elfassy, Mastering Microsoft Exchange 2013, Sybex