SlideShare a Scribd company logo
1 of 52
Exchange Server 2013
High Availability | Site Resilience
Scott Schnoll
Principal Technical Writer
Microsoft Corporation
Serveurs / Entreprise / Réseaux / IT
Twitter: Schnoll
Blog: http://aka.ms/Schnoll
• Storage
• High Availability
• Site Resilience
Agenda
STORAGE
• Capacity is increasing, but IOPS are not
• Database sizes must be manageable
• Reseeds must be fast and reliable
• Passive copy IOPS are inefficient
• Lagged copies have asymmetric storage
requirements
• Low agility from low disk space recovery
Storage Challenges
• Multiple Databases Per Volume
• Automatic Reseed
• Automatic Recovery from Storage Failures
• Lagged Copy Enhancements
Storage Enhancements
MULTIPLE DATABASE PER VOLUME
Multiple databases per volume
DB4DB3DB2
PassiveActive Lagged
4-member DAG
4 databases
4 copies of each database
4 databases per volume
Symmetrical design with balanced
activation preference
Number of copies per database =
number of databasesper volume
Multiple databases per volume
DB1DB1DB1
PassiveActive Lagged
Single database copy/disk:
Reseed 2TB Database = ~23 hrs
Reseed 8TB Database = ~93 hrs
20 MB/s
Multiple databases per volume
DB4DB3DB2
PassiveActive Lagged
Single database copy/disk:
Reseed 2TB Database = ~23 hrs
Reseed 8TB Database = ~93 hrs
4 database copies/disk:
Reseed 2TB Disk = ~9.7 hrs
Reseed 8TB Disk = ~39 hrs12 MB/s
12 MB/s
20 MB/s 20 MB/s
• Requirements
– Single logical disk/partition per physical disk
• Recommendations
– Databases per volume should equal the number of
copies per database
– Same neighbors on all servers
– Balance activation preferences
Multiple databases per volume
AUTORESEED
• Disk failure on active copy = database
failover
• Failed disk and database corruption issues
need to be addressed quickly
• Fast recovery to restore redundancy is
needed
Seeding Challenges
• Autoreseed - automatically restore
redundancy after disk failure
Seeding Enhancements
In-Use
Storage
X
Autoreseed
Periodically
scan for failed
and
suspended
copies
Check
prerequisites:
single
copy, spare
availability
Allocate and
remap a spare
Start the seed
Verify that the
new copy is
healthy
Admin
replaces
failed disk
Autoreseed
Configure storage
subsystem with spare disks
Create DAG, add servers
with configured storage
Create directory and
mount points
Configure DAG, including 3
new properties
Create mailbox databases
and database copies
MDB1 MDB2
MDB1 MDB2
MDB1.DB MDB1.log
MDB1.DB MDB1.log
AutoDagDatabasesRootFolderPath
AutoDagVolumesRootFolderPath
AutoDagDatabaseCopiesPerVolume = 1
• Requirements
– Single logical disk/partition per physical disk
– Specific database and log folder structure must be used
• Recommendations
– Same neighbors on all servers
– Databases per volume should equal the number of copies
per database
– Balance activation preferences
• Configuration instructions
– http://aka.ms/autoreseed
Autoreseed
AUTOMATIC RECOVERY FROM STORAGE
FAILURES
• Storage controllers are basically mini-PCs
– As such, they can crash, hang, etc., requiring
administrative intervention
• Other operator-recoverable conditions can
occur
– Loss of vital system elements
– Hung or highly latent IO
Recovery Challenges
• Innovations added in Exchange 2010 carried forward
• New recovery behaviors added to Exchange 2013
– Even more added to Exchange 2013 CU1
Recovery Enhancements
Exchange Server 2010 Exchange Server 2013
ESE Database Hung IO (240s) System Bad State (302s)
Failure Item Channel Heartbeat (30s) Long I/O times (41s)
SystemDisk Heartbeat (120s) MSExchangeRepl.exe memory threshold (4GB)
Exchange Server 2013 CU1
Bus reset (event 129)
Replication service endpoints not responding
LAGGED COPY ENHANCEMENTS
• Activation is difficult
• Lagged copies require manual care
• Lagged copies cannot be page patched
Lagged Copy Challenged
• Automatic log file replay in a variety of
situations
– Low disk space (enable in registry)
– Page patching (enabled by default)
– Less than 3 other healthy copies (enable in AD;
configure in registry)
• Integration with Safety Net
– No need for log surgery or hunting for the point of
corruption
Lagged Copy Enhancements
HIGH AVAILABILITY
• High availability focuses on database health
• Best copy selection insufficient for new
architecture
• Management challenges around
maintenance and DAG network
configuration
High Availability Challenges
• Managed Availability
• Best Copy and Server Selection
• DAG Network Autoconfig
High Availability Enhancements
MANAGED AVAILABILITY
• Key tenet for Exchange 2013:
– All access to a mailbox is provided by the protocol stack on the
Mailbox server that hosts the active copy of the user’s mailbox
• If a protocol is down on a Mailbox server, all active
databases lose access via that protocol
• Managed Availability was introduced to detect these
kinds of failures and automatically correct them
– For most protocols, quick recovery is achieved via a restart action
– If the restart action fails, a failover can be triggered
• Each protocol team designed their own recovery sequence, which
is based on their experiences running Office 365 – service
experience accrues to the on-premises admin!
Managed Availability
• An internal framework used by component
teams
• Sequencing mechanism to control when
recovery actions are taken versus alerting and
escalation
• Enhances the best copy selection algorithm by
taking into account server health
• Includes a mechanism for taking servers in/out
of service (maintenance mode)
Managed Availability
• MA failovers are recovery action from failure
– Detected via a synthetic operation or live data
– Throttled in time and across the DAG
• MA failovers come in two forms
– Server: Protocol failure can trigger server failover
– Database: Store-detected database failure can trigger database
failover
• MA includes Single Copy Alert
– Alert is per-server to reduce flow
– Still triggered across all machines with copies
– Monitoring triggered through a notification
– Logs 4138 (red) and 4139 (green) events
Managed Availability
BEST COPY AND SERVER SELECTION
• Exchange 2010 used several criteria
– Copy queue length
– Replay queue length
– Database copy status – including activation blocked
– Content index status
• Using just this criteria is not good enough
for Exchange 2013, because protocol health
is not considered
Best Copy Selection Challenges
• Still an Active Manager algorithm performed at
*over time based on extracted health of the system
– Replication health still determined by same criteria and phases
– Criteria now includes health of the entire protocol stack
• Considers a prioritized protocol health set in the
selection
– Four priorities – critical, high, medium, low (all health sets have a
priority)
– Failover responders trigger added checks to select a “protocol not
worse” target
Best Copy and Server Selection
Best Copy and Server Selection
All Healthy
Checks for a server hosting a copy that has all health sets in a healthy state
Up to Normal Healthy
Checks for a server hosting a copy that has all health sets Medium and above in a healthy state
All Better than Source
Checks for a server hosting a copy that has health sets in a state that is better than the current
server hosting the affected copy
Same as Source
Checks for a server hosting a copy of the affected database that has health sets in a state that is the same as the
current server hosting the affected copy
DAG NETWORK AUTOCONFIG
• DAG networks must be manually collapsed
in a multi-subnet deployment
• Continuing to reduce administrative burden
for deployment and initial configuration
DAG Network Challenges
• DAGs now default to automatic
configuration
– Still requires specific configuration settings on NICs
– Manual edits and EAC controls blocked when
automatic networking is enabled
– Set DAG to manual network setup to edit or change
DAG networks
• Multi-subnet DAG networks automatically
collapsed
DAG Network Enhancements
DAG Network Enhancements
SITE RESILIENCE
• Operationally complex
• Mailbox and Client Access recovery
connected
• Namespace is a SPOF
Site Resilience Challenges
• Operationally simplified
• Mailbox and Client Access recovery
independent
• Namespace provides redundancy
Site Resilience Enhancements
• Previously loss of CAS, CAS array, VIP, LB,
some portion of the DAG required admin to
perform a datacenter switchover
• In Exchange Server 2013, recovery happens
automatically
– The admin focuses on fixing the issue, instead of
restoring service
Site Resilience – Operationally Simplified
• Previously, CAS and Mailbox server
recovery were tied together in site
recoveries
• In Exchange Server 2013, recovery is
independent, and may come automatically
in the form of failover
Site Resilience – Recovery Independent
• DNS resolves to multiple IP addresses
• Almost all protocol access in Exchange 2013 is
HTTP
• HTTP clients have built-in IP failover capabilities
• Clients skip past IPs that produce hard TCP failures
• Admins can switchover by removing VIP from DNS
• Namespace no longer a SPOF
• No dealing with DNS latency
Site Resilience – Namespace Redundancy
• With the namespace simplification, consolidation of
server roles, separation of CAS array and DAG
recovery, and load balancing changes, three
locations can simplify mailbox recovery and provide
datacenter failovers
• You must have at least three locations
– Two locations with Exchange; one with witness server
– Exchange sites must be well-connected
– Witness server site must be isolated from network failures
affecting Exchange sites
Site Resilience – Three Locations
PortlandRedmond
Site Resilience
cas3 cas4cas1 cas2
PortlandRedmond
Site Resilience
dag1
mbx1 mbx2 mbx3 mbx4
Assuming MBX3 and MBX4 are operating and one of them can lock the witness.log
file, automatic failover should occur
witness
PortlandRedmond
Site Resilience
dag1mbx1 mbx2 mbx3 mbx4
PortlandRedmond
dag1
Site Resilience
mbx1 mbx2 mbx3 mbx4
1. Mark the failed servers/site as down: Stop-DatabaseAvailabilityGroup DAG1 –ActiveDirectorySite:Redmond
2. Stop the Cluster Service on Remaining DAG members: Stop-Clussvc
3. Activate DAG members in 2nd datacenter: Restore-DatabaseAvailabilityGroup DAG1 –ActiveDirectorySite:Portland
SUMMARY
• Many storage enhancements targeted
towards JBOD environments
• Numerous high availability improvements
• Site resilience operationally simplified
Summary
Scott Schnoll
Principal Technical Writer
scott.schnoll@microsoft.com
http://aka.ms/schnoll
schnoll
Questions?
Formez-vous en ligne
Retrouvez nos évènements
Faites-vous accompagner
gratuitement
Essayer gratuitement nos
solutions IT
Retrouver nos experts
Microsoft
Pros de l’ITDéveloppeurs
www.microsoftvirtualacademy.comhttp://aka.ms/generation-app
http://aka.ms/evenements-
developpeurs
http://aka.ms/itcamps-france
Les accélérateurs
Windows Azure, Windows Phone,
Windows 8
http://aka.ms/telechargements
La Dev’Team sur MSDN
http://aka.ms/devteam
L’IT Team sur TechNet
http://aka.ms/itteam

More Related Content

What's hot

Best Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBBest Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBMariaDB plc
 
Aceleracion de aplicacione 2
Aceleracion de aplicacione 2Aceleracion de aplicacione 2
Aceleracion de aplicacione 2jfth
 
MariaDB High Availability
MariaDB High AvailabilityMariaDB High Availability
MariaDB High AvailabilityMariaDB plc
 
Using all of the high availability options in MariaDB
Using all of the high availability options in MariaDBUsing all of the high availability options in MariaDB
Using all of the high availability options in MariaDBMariaDB plc
 
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices Session
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices SessionNZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices Session
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices SessionMichael Noel
 
Datasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmrafDatasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmrafMidVision
 
X-DB Replication Server and MMR
X-DB Replication Server and MMRX-DB Replication Server and MMR
X-DB Replication Server and MMRAshnikbiz
 
OpenText Archive Server on Azure
OpenText Archive Server on AzureOpenText Archive Server on Azure
OpenText Archive Server on AzureGary Jackson MBCS
 
IBM MQ High Availabillity and Disaster Recovery (2017 version)
IBM MQ High Availabillity and Disaster Recovery (2017 version)IBM MQ High Availabillity and Disaster Recovery (2017 version)
IBM MQ High Availabillity and Disaster Recovery (2017 version)MarkTaylorIBM
 
Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA EDB
 
Introduction to failover clustering with sql server
Introduction to failover clustering with sql serverIntroduction to failover clustering with sql server
Introduction to failover clustering with sql serverEduardo Castro
 
Enterprise PostgreSQL - EDB's answer to conventional Databases
Enterprise PostgreSQL - EDB's answer to conventional DatabasesEnterprise PostgreSQL - EDB's answer to conventional Databases
Enterprise PostgreSQL - EDB's answer to conventional DatabasesAshnikbiz
 
Postgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPostgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPavan Deolasee
 
MCSA Installing & Configuring Windows Server 2012 70-410
MCSA Installing & Configuring Windows Server 2012 70-410MCSA Installing & Configuring Windows Server 2012 70-410
MCSA Installing & Configuring Windows Server 2012 70-410omardabbas
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP PerformanceBIOVIA
 
Continuent Tungsten - Scalable Saa S Data Management
Continuent Tungsten - Scalable Saa S Data ManagementContinuent Tungsten - Scalable Saa S Data Management
Continuent Tungsten - Scalable Saa S Data Managementguest2e11e8
 
Linkedin NUS QCon 2009 slides
Linkedin NUS QCon 2009 slidesLinkedin NUS QCon 2009 slides
Linkedin NUS QCon 2009 slidesruslansv
 

What's hot (20)

XPages Performance Master Class - Survive in the fast lane on the Autobahn (E...
XPages Performance Master Class - Survive in the fast lane on the Autobahn (E...XPages Performance Master Class - Survive in the fast lane on the Autobahn (E...
XPages Performance Master Class - Survive in the fast lane on the Autobahn (E...
 
Best Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDBBest Practice for Achieving High Availability in MariaDB
Best Practice for Achieving High Availability in MariaDB
 
Aceleracion de aplicacione 2
Aceleracion de aplicacione 2Aceleracion de aplicacione 2
Aceleracion de aplicacione 2
 
MariaDB High Availability
MariaDB High AvailabilityMariaDB High Availability
MariaDB High Availability
 
Clustering
Clustering Clustering
Clustering
 
Using all of the high availability options in MariaDB
Using all of the high availability options in MariaDBUsing all of the high availability options in MariaDB
Using all of the high availability options in MariaDB
 
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices Session
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices SessionNZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices Session
NZSPC 2013 - Ultimate SharePoint Infrastructure Best Practices Session
 
Datasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmrafDatasheet weblogic midvisionextensionforibmraf
Datasheet weblogic midvisionextensionforibmraf
 
X-DB Replication Server and MMR
X-DB Replication Server and MMRX-DB Replication Server and MMR
X-DB Replication Server and MMR
 
OpenText Archive Server on Azure
OpenText Archive Server on AzureOpenText Archive Server on Azure
OpenText Archive Server on Azure
 
IBM MQ High Availabillity and Disaster Recovery (2017 version)
IBM MQ High Availabillity and Disaster Recovery (2017 version)IBM MQ High Availabillity and Disaster Recovery (2017 version)
IBM MQ High Availabillity and Disaster Recovery (2017 version)
 
Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA Best Practices for Becoming an Exceptional Postgres DBA
Best Practices for Becoming an Exceptional Postgres DBA
 
Introduction to failover clustering with sql server
Introduction to failover clustering with sql serverIntroduction to failover clustering with sql server
Introduction to failover clustering with sql server
 
Enterprise PostgreSQL - EDB's answer to conventional Databases
Enterprise PostgreSQL - EDB's answer to conventional DatabasesEnterprise PostgreSQL - EDB's answer to conventional Databases
Enterprise PostgreSQL - EDB's answer to conventional Databases
 
Postgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL ClusterPostgres-XC: Symmetric PostgreSQL Cluster
Postgres-XC: Symmetric PostgreSQL Cluster
 
MCSA Installing & Configuring Windows Server 2012 70-410
MCSA Installing & Configuring Windows Server 2012 70-410MCSA Installing & Configuring Windows Server 2012 70-410
MCSA Installing & Configuring Windows Server 2012 70-410
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance
 
Continuent Tungsten - Scalable Saa S Data Management
Continuent Tungsten - Scalable Saa S Data ManagementContinuent Tungsten - Scalable Saa S Data Management
Continuent Tungsten - Scalable Saa S Data Management
 
Linkedin NUS QCon 2009 slides
Linkedin NUS QCon 2009 slidesLinkedin NUS QCon 2009 slides
Linkedin NUS QCon 2009 slides
 
MCSA 70-412 Chapter 09
MCSA 70-412 Chapter 09MCSA 70-412 Chapter 09
MCSA 70-412 Chapter 09
 

Similar to Exchange Server 2013 : les mécanismes de haute disponibilité et la redondance / résilience de site

05. performance-concepts
05. performance-concepts05. performance-concepts
05. performance-conceptsMuhammad Ahad
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool ManagementBIOVIA
 
MySQL enterprise backup overview
MySQL enterprise backup overviewMySQL enterprise backup overview
MySQL enterprise backup overview郁萍 王
 
MySQL Enterprise Backup
MySQL Enterprise BackupMySQL Enterprise Backup
MySQL Enterprise BackupMario Beck
 
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Antonios Chatzipavlis
 
Presentation cloud control enterprise manager 12c
Presentation   cloud control enterprise manager 12cPresentation   cloud control enterprise manager 12c
Presentation cloud control enterprise manager 12cxKinAnx
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityOSSCube
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera ClusterAbdul Manaf
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Planning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPMPlanning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPMWASdev Community
 
Roman Rehak: 24/7 Database Administration + Database Mail Unleashed
Roman Rehak: 24/7 Database Administration + Database Mail UnleashedRoman Rehak: 24/7 Database Administration + Database Mail Unleashed
Roman Rehak: 24/7 Database Administration + Database Mail UnleashedMSDEVMTL
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephenSteve Feldman
 
Managing Exchange 2016 - Paul Robichaux
Managing Exchange 2016 - Paul RobichauxManaging Exchange 2016 - Paul Robichaux
Managing Exchange 2016 - Paul RobichauxSummit 7 Systems
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutSander Temme
 
Dr and ha solutions with sql server azure
Dr and ha solutions with sql server azureDr and ha solutions with sql server azure
Dr and ha solutions with sql server azureMSDEVMTL
 
Gfs google-file-system-13331
Gfs google-file-system-13331Gfs google-file-system-13331
Gfs google-file-system-13331Fengchang Xie
 
Dueling duplications RMAN vs Delphix
Dueling duplications RMAN vs DelphixDueling duplications RMAN vs Delphix
Dueling duplications RMAN vs DelphixKyle Hailey
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16Sanjay Manwani
 

Similar to Exchange Server 2013 : les mécanismes de haute disponibilité et la redondance / résilience de site (20)

Unit 2 oracle9i
Unit 2  oracle9i Unit 2  oracle9i
Unit 2 oracle9i
 
05. performance-concepts
05. performance-concepts05. performance-concepts
05. performance-concepts
 
(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management(ATS4-PLAT08) Server Pool Management
(ATS4-PLAT08) Server Pool Management
 
MySQL enterprise backup overview
MySQL enterprise backup overviewMySQL enterprise backup overview
MySQL enterprise backup overview
 
MySQL Enterprise Backup
MySQL Enterprise BackupMySQL Enterprise Backup
MySQL Enterprise Backup
 
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018 Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
Azure SQL Database for the SQL Server DBA - Azure Bootcamp Athens 2018
 
Presentation cloud control enterprise manager 12c
Presentation   cloud control enterprise manager 12cPresentation   cloud control enterprise manager 12c
Presentation cloud control enterprise manager 12c
 
Maria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High AvailabilityMaria DB Galera Cluster for High Availability
Maria DB Galera Cluster for High Availability
 
MariaDB Galera Cluster
MariaDB Galera ClusterMariaDB Galera Cluster
MariaDB Galera Cluster
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Planning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPMPlanning For Catastrophe with IBM WAS and IBM BPM
Planning For Catastrophe with IBM WAS and IBM BPM
 
Roman Rehak: 24/7 Database Administration + Database Mail Unleashed
Roman Rehak: 24/7 Database Administration + Database Mail UnleashedRoman Rehak: 24/7 Database Administration + Database Mail Unleashed
Roman Rehak: 24/7 Database Administration + Database Mail Unleashed
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
 
Managing Exchange 2016 - Paul Robichaux
Managing Exchange 2016 - Paul RobichauxManaging Exchange 2016 - Paul Robichaux
Managing Exchange 2016 - Paul Robichaux
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
Dr and ha solutions with sql server azure
Dr and ha solutions with sql server azureDr and ha solutions with sql server azure
Dr and ha solutions with sql server azure
 
Gfs google-file-system-13331
Gfs google-file-system-13331Gfs google-file-system-13331
Gfs google-file-system-13331
 
Dueling duplications RMAN vs Delphix
Dueling duplications RMAN vs DelphixDueling duplications RMAN vs Delphix
Dueling duplications RMAN vs Delphix
 
Database Management System - 2a
Database Management System - 2aDatabase Management System - 2a
Database Management System - 2a
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16
 

More from Microsoft Technet France

Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex
Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex
Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex Microsoft Technet France
 
Comment réussir votre déploiement de Windows 10
Comment réussir votre déploiement de Windows 10Comment réussir votre déploiement de Windows 10
Comment réussir votre déploiement de Windows 10Microsoft Technet France
 
Fusion, Acquisition - Optimisez la migration et la continuité des outils col...
 Fusion, Acquisition - Optimisez la migration et la continuité des outils col... Fusion, Acquisition - Optimisez la migration et la continuité des outils col...
Fusion, Acquisition - Optimisez la migration et la continuité des outils col...Microsoft Technet France
 
Wavestone déploie son portail Powell 365 en 5 semaines
Wavestone déploie son portail Powell 365 en 5 semainesWavestone déploie son portail Powell 365 en 5 semaines
Wavestone déploie son portail Powell 365 en 5 semainesMicrosoft Technet France
 
Retour d’expérience sur le monitoring et la sécurisation des identités Azure
Retour d’expérience sur le monitoring et la sécurisation des identités AzureRetour d’expérience sur le monitoring et la sécurisation des identités Azure
Retour d’expérience sur le monitoring et la sécurisation des identités AzureMicrosoft Technet France
 
Scénarios de mobilité couverts par Enterprise Mobility + Security
Scénarios de mobilité couverts par Enterprise Mobility + SecurityScénarios de mobilité couverts par Enterprise Mobility + Security
Scénarios de mobilité couverts par Enterprise Mobility + SecurityMicrosoft Technet France
 
SharePoint Framework : le développement SharePoint nouvelle génération
SharePoint Framework : le développement SharePoint nouvelle générationSharePoint Framework : le développement SharePoint nouvelle génération
SharePoint Framework : le développement SharePoint nouvelle générationMicrosoft Technet France
 
Stockage Cloud : il y en aura pour tout le monde
Stockage Cloud : il y en aura pour tout le mondeStockage Cloud : il y en aura pour tout le monde
Stockage Cloud : il y en aura pour tout le mondeMicrosoft Technet France
 
Bien appréhender le concept de Windows As a Service
Bien appréhender le concept de Windows As a ServiceBien appréhender le concept de Windows As a Service
Bien appréhender le concept de Windows As a ServiceMicrosoft Technet France
 
Protéger vos données avec le chiffrement dans Azure et Office 365
Protéger vos données avec le chiffrement dans Azure et Office 365Protéger vos données avec le chiffrement dans Azure et Office 365
Protéger vos données avec le chiffrement dans Azure et Office 365Microsoft Technet France
 
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...Microsoft Technet France
 
Comprendre la stratégie identité de Microsoft
Comprendre la stratégie identité de MicrosoftComprendre la stratégie identité de Microsoft
Comprendre la stratégie identité de MicrosoftMicrosoft Technet France
 
Vous avez dit « authentification sans mot de passe » : une illustration avec ...
Vous avez dit « authentification sans mot de passe » : une illustration avec ...Vous avez dit « authentification sans mot de passe » : une illustration avec ...
Vous avez dit « authentification sans mot de passe » : une illustration avec ...Microsoft Technet France
 
Déploiement hybride, la téléphonie dans le cloud
Déploiement hybride, la téléphonie dans le cloudDéploiement hybride, la téléphonie dans le cloud
Déploiement hybride, la téléphonie dans le cloudMicrosoft Technet France
 
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...Microsoft Technet France
 
SharePoint 2016 : architecture, déploiement et topologies hybrides
SharePoint 2016 : architecture, déploiement et topologies hybridesSharePoint 2016 : architecture, déploiement et topologies hybrides
SharePoint 2016 : architecture, déploiement et topologies hybridesMicrosoft Technet France
 
Gestion de Windows 10 et des applications dans l'entreprise moderne
Gestion de Windows 10 et des applications dans l'entreprise moderneGestion de Windows 10 et des applications dans l'entreprise moderne
Gestion de Windows 10 et des applications dans l'entreprise moderneMicrosoft Technet France
 
Office 365 dans votre Système d'Informations
Office 365 dans votre Système d'InformationsOffice 365 dans votre Système d'Informations
Office 365 dans votre Système d'InformationsMicrosoft Technet France
 

More from Microsoft Technet France (20)

Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex
Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex
Automatisez, visualisez et améliorez vos processus d’entreprise avec Nintex
 
Comment réussir votre déploiement de Windows 10
Comment réussir votre déploiement de Windows 10Comment réussir votre déploiement de Windows 10
Comment réussir votre déploiement de Windows 10
 
OMS log search au quotidien
OMS log search au quotidienOMS log search au quotidien
OMS log search au quotidien
 
Fusion, Acquisition - Optimisez la migration et la continuité des outils col...
 Fusion, Acquisition - Optimisez la migration et la continuité des outils col... Fusion, Acquisition - Optimisez la migration et la continuité des outils col...
Fusion, Acquisition - Optimisez la migration et la continuité des outils col...
 
Wavestone déploie son portail Powell 365 en 5 semaines
Wavestone déploie son portail Powell 365 en 5 semainesWavestone déploie son portail Powell 365 en 5 semaines
Wavestone déploie son portail Powell 365 en 5 semaines
 
Retour d’expérience sur le monitoring et la sécurisation des identités Azure
Retour d’expérience sur le monitoring et la sécurisation des identités AzureRetour d’expérience sur le monitoring et la sécurisation des identités Azure
Retour d’expérience sur le monitoring et la sécurisation des identités Azure
 
Scénarios de mobilité couverts par Enterprise Mobility + Security
Scénarios de mobilité couverts par Enterprise Mobility + SecurityScénarios de mobilité couverts par Enterprise Mobility + Security
Scénarios de mobilité couverts par Enterprise Mobility + Security
 
SharePoint Framework : le développement SharePoint nouvelle génération
SharePoint Framework : le développement SharePoint nouvelle générationSharePoint Framework : le développement SharePoint nouvelle génération
SharePoint Framework : le développement SharePoint nouvelle génération
 
Stockage Cloud : il y en aura pour tout le monde
Stockage Cloud : il y en aura pour tout le mondeStockage Cloud : il y en aura pour tout le monde
Stockage Cloud : il y en aura pour tout le monde
 
Bien appréhender le concept de Windows As a Service
Bien appréhender le concept de Windows As a ServiceBien appréhender le concept de Windows As a Service
Bien appréhender le concept de Windows As a Service
 
Protéger vos données avec le chiffrement dans Azure et Office 365
Protéger vos données avec le chiffrement dans Azure et Office 365Protéger vos données avec le chiffrement dans Azure et Office 365
Protéger vos données avec le chiffrement dans Azure et Office 365
 
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...
Protéger votre patrimoine informationnel dans un monde hybride avec Azure Inf...
 
Comprendre la stratégie identité de Microsoft
Comprendre la stratégie identité de MicrosoftComprendre la stratégie identité de Microsoft
Comprendre la stratégie identité de Microsoft
 
Vous avez dit « authentification sans mot de passe » : une illustration avec ...
Vous avez dit « authentification sans mot de passe » : une illustration avec ...Vous avez dit « authentification sans mot de passe » : une illustration avec ...
Vous avez dit « authentification sans mot de passe » : une illustration avec ...
 
Sécurité des données
Sécurité des donnéesSécurité des données
Sécurité des données
 
Déploiement hybride, la téléphonie dans le cloud
Déploiement hybride, la téléphonie dans le cloudDéploiement hybride, la téléphonie dans le cloud
Déploiement hybride, la téléphonie dans le cloud
 
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...
Supervisez la qualité des appels Skype for Business Online à l'aide de Call Q...
 
SharePoint 2016 : architecture, déploiement et topologies hybrides
SharePoint 2016 : architecture, déploiement et topologies hybridesSharePoint 2016 : architecture, déploiement et topologies hybrides
SharePoint 2016 : architecture, déploiement et topologies hybrides
 
Gestion de Windows 10 et des applications dans l'entreprise moderne
Gestion de Windows 10 et des applications dans l'entreprise moderneGestion de Windows 10 et des applications dans l'entreprise moderne
Gestion de Windows 10 et des applications dans l'entreprise moderne
 
Office 365 dans votre Système d'Informations
Office 365 dans votre Système d'InformationsOffice 365 dans votre Système d'Informations
Office 365 dans votre Système d'Informations
 

Recently uploaded

Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Exchange Server 2013 : les mécanismes de haute disponibilité et la redondance / résilience de site

  • 1. Exchange Server 2013 High Availability | Site Resilience Scott Schnoll Principal Technical Writer Microsoft Corporation Serveurs / Entreprise / Réseaux / IT Twitter: Schnoll Blog: http://aka.ms/Schnoll
  • 2. • Storage • High Availability • Site Resilience Agenda
  • 4. • Capacity is increasing, but IOPS are not • Database sizes must be manageable • Reseeds must be fast and reliable • Passive copy IOPS are inefficient • Lagged copies have asymmetric storage requirements • Low agility from low disk space recovery Storage Challenges
  • 5. • Multiple Databases Per Volume • Automatic Reseed • Automatic Recovery from Storage Failures • Lagged Copy Enhancements Storage Enhancements
  • 7. Multiple databases per volume DB4DB3DB2 PassiveActive Lagged 4-member DAG 4 databases 4 copies of each database 4 databases per volume Symmetrical design with balanced activation preference Number of copies per database = number of databasesper volume
  • 8. Multiple databases per volume DB1DB1DB1 PassiveActive Lagged Single database copy/disk: Reseed 2TB Database = ~23 hrs Reseed 8TB Database = ~93 hrs 20 MB/s
  • 9. Multiple databases per volume DB4DB3DB2 PassiveActive Lagged Single database copy/disk: Reseed 2TB Database = ~23 hrs Reseed 8TB Database = ~93 hrs 4 database copies/disk: Reseed 2TB Disk = ~9.7 hrs Reseed 8TB Disk = ~39 hrs12 MB/s 12 MB/s 20 MB/s 20 MB/s
  • 10. • Requirements – Single logical disk/partition per physical disk • Recommendations – Databases per volume should equal the number of copies per database – Same neighbors on all servers – Balance activation preferences Multiple databases per volume
  • 12. • Disk failure on active copy = database failover • Failed disk and database corruption issues need to be addressed quickly • Fast recovery to restore redundancy is needed Seeding Challenges
  • 13. • Autoreseed - automatically restore redundancy after disk failure Seeding Enhancements In-Use Storage X
  • 14. Autoreseed Periodically scan for failed and suspended copies Check prerequisites: single copy, spare availability Allocate and remap a spare Start the seed Verify that the new copy is healthy Admin replaces failed disk
  • 15. Autoreseed Configure storage subsystem with spare disks Create DAG, add servers with configured storage Create directory and mount points Configure DAG, including 3 new properties Create mailbox databases and database copies MDB1 MDB2 MDB1 MDB2 MDB1.DB MDB1.log MDB1.DB MDB1.log AutoDagDatabasesRootFolderPath AutoDagVolumesRootFolderPath AutoDagDatabaseCopiesPerVolume = 1
  • 16. • Requirements – Single logical disk/partition per physical disk – Specific database and log folder structure must be used • Recommendations – Same neighbors on all servers – Databases per volume should equal the number of copies per database – Balance activation preferences • Configuration instructions – http://aka.ms/autoreseed Autoreseed
  • 17. AUTOMATIC RECOVERY FROM STORAGE FAILURES
  • 18. • Storage controllers are basically mini-PCs – As such, they can crash, hang, etc., requiring administrative intervention • Other operator-recoverable conditions can occur – Loss of vital system elements – Hung or highly latent IO Recovery Challenges
  • 19. • Innovations added in Exchange 2010 carried forward • New recovery behaviors added to Exchange 2013 – Even more added to Exchange 2013 CU1 Recovery Enhancements Exchange Server 2010 Exchange Server 2013 ESE Database Hung IO (240s) System Bad State (302s) Failure Item Channel Heartbeat (30s) Long I/O times (41s) SystemDisk Heartbeat (120s) MSExchangeRepl.exe memory threshold (4GB) Exchange Server 2013 CU1 Bus reset (event 129) Replication service endpoints not responding
  • 21. • Activation is difficult • Lagged copies require manual care • Lagged copies cannot be page patched Lagged Copy Challenged
  • 22. • Automatic log file replay in a variety of situations – Low disk space (enable in registry) – Page patching (enabled by default) – Less than 3 other healthy copies (enable in AD; configure in registry) • Integration with Safety Net – No need for log surgery or hunting for the point of corruption Lagged Copy Enhancements
  • 24. • High availability focuses on database health • Best copy selection insufficient for new architecture • Management challenges around maintenance and DAG network configuration High Availability Challenges
  • 25. • Managed Availability • Best Copy and Server Selection • DAG Network Autoconfig High Availability Enhancements
  • 27. • Key tenet for Exchange 2013: – All access to a mailbox is provided by the protocol stack on the Mailbox server that hosts the active copy of the user’s mailbox • If a protocol is down on a Mailbox server, all active databases lose access via that protocol • Managed Availability was introduced to detect these kinds of failures and automatically correct them – For most protocols, quick recovery is achieved via a restart action – If the restart action fails, a failover can be triggered • Each protocol team designed their own recovery sequence, which is based on their experiences running Office 365 – service experience accrues to the on-premises admin! Managed Availability
  • 28. • An internal framework used by component teams • Sequencing mechanism to control when recovery actions are taken versus alerting and escalation • Enhances the best copy selection algorithm by taking into account server health • Includes a mechanism for taking servers in/out of service (maintenance mode) Managed Availability
  • 29. • MA failovers are recovery action from failure – Detected via a synthetic operation or live data – Throttled in time and across the DAG • MA failovers come in two forms – Server: Protocol failure can trigger server failover – Database: Store-detected database failure can trigger database failover • MA includes Single Copy Alert – Alert is per-server to reduce flow – Still triggered across all machines with copies – Monitoring triggered through a notification – Logs 4138 (red) and 4139 (green) events Managed Availability
  • 30. BEST COPY AND SERVER SELECTION
  • 31. • Exchange 2010 used several criteria – Copy queue length – Replay queue length – Database copy status – including activation blocked – Content index status • Using just this criteria is not good enough for Exchange 2013, because protocol health is not considered Best Copy Selection Challenges
  • 32. • Still an Active Manager algorithm performed at *over time based on extracted health of the system – Replication health still determined by same criteria and phases – Criteria now includes health of the entire protocol stack • Considers a prioritized protocol health set in the selection – Four priorities – critical, high, medium, low (all health sets have a priority) – Failover responders trigger added checks to select a “protocol not worse” target Best Copy and Server Selection
  • 33. Best Copy and Server Selection All Healthy Checks for a server hosting a copy that has all health sets in a healthy state Up to Normal Healthy Checks for a server hosting a copy that has all health sets Medium and above in a healthy state All Better than Source Checks for a server hosting a copy that has health sets in a state that is better than the current server hosting the affected copy Same as Source Checks for a server hosting a copy of the affected database that has health sets in a state that is the same as the current server hosting the affected copy
  • 35. • DAG networks must be manually collapsed in a multi-subnet deployment • Continuing to reduce administrative burden for deployment and initial configuration DAG Network Challenges
  • 36. • DAGs now default to automatic configuration – Still requires specific configuration settings on NICs – Manual edits and EAC controls blocked when automatic networking is enabled – Set DAG to manual network setup to edit or change DAG networks • Multi-subnet DAG networks automatically collapsed DAG Network Enhancements
  • 39. • Operationally complex • Mailbox and Client Access recovery connected • Namespace is a SPOF Site Resilience Challenges
  • 40. • Operationally simplified • Mailbox and Client Access recovery independent • Namespace provides redundancy Site Resilience Enhancements
  • 41. • Previously loss of CAS, CAS array, VIP, LB, some portion of the DAG required admin to perform a datacenter switchover • In Exchange Server 2013, recovery happens automatically – The admin focuses on fixing the issue, instead of restoring service Site Resilience – Operationally Simplified
  • 42. • Previously, CAS and Mailbox server recovery were tied together in site recoveries • In Exchange Server 2013, recovery is independent, and may come automatically in the form of failover Site Resilience – Recovery Independent
  • 43. • DNS resolves to multiple IP addresses • Almost all protocol access in Exchange 2013 is HTTP • HTTP clients have built-in IP failover capabilities • Clients skip past IPs that produce hard TCP failures • Admins can switchover by removing VIP from DNS • Namespace no longer a SPOF • No dealing with DNS latency Site Resilience – Namespace Redundancy
  • 44. • With the namespace simplification, consolidation of server roles, separation of CAS array and DAG recovery, and load balancing changes, three locations can simplify mailbox recovery and provide datacenter failovers • You must have at least three locations – Two locations with Exchange; one with witness server – Exchange sites must be well-connected – Witness server site must be isolated from network failures affecting Exchange sites Site Resilience – Three Locations
  • 46. PortlandRedmond Site Resilience dag1 mbx1 mbx2 mbx3 mbx4 Assuming MBX3 and MBX4 are operating and one of them can lock the witness.log file, automatic failover should occur witness
  • 48. PortlandRedmond dag1 Site Resilience mbx1 mbx2 mbx3 mbx4 1. Mark the failed servers/site as down: Stop-DatabaseAvailabilityGroup DAG1 –ActiveDirectorySite:Redmond 2. Stop the Cluster Service on Remaining DAG members: Stop-Clussvc 3. Activate DAG members in 2nd datacenter: Restore-DatabaseAvailabilityGroup DAG1 –ActiveDirectorySite:Portland
  • 50. • Many storage enhancements targeted towards JBOD environments • Numerous high availability improvements • Site resilience operationally simplified Summary
  • 51. Scott Schnoll Principal Technical Writer scott.schnoll@microsoft.com http://aka.ms/schnoll schnoll Questions?
  • 52. Formez-vous en ligne Retrouvez nos évènements Faites-vous accompagner gratuitement Essayer gratuitement nos solutions IT Retrouver nos experts Microsoft Pros de l’ITDéveloppeurs www.microsoftvirtualacademy.comhttp://aka.ms/generation-app http://aka.ms/evenements- developpeurs http://aka.ms/itcamps-france Les accélérateurs Windows Azure, Windows Phone, Windows 8 http://aka.ms/telechargements La Dev’Team sur MSDN http://aka.ms/devteam L’IT Team sur TechNet http://aka.ms/itteam

Editor's Notes

  1. Intro Serveurs / Entreprise / Reseaux / IT
  2. See bug 2527095 for the event 129 logic.Example event 129:Log Name: SystemSource: HpCISSs2Date: 1/28/2012 6:55:34 PMEvent ID: 129Task Category: NoneLevel: WarningKeywords: ClassicUser: N/AComputer: DB3PRD0104MB115.eurprd01.prod.exchangelabs.comDescription:Reset to device, \\Device\\RaidPort0, was issued.
  3. Single copy alert (aka database redundancy check) is now integrated with the Replication service.