UNC307 - Microsoft Exchange Server 2010 High Availability

Welcome to the future! The future of Exchange high availability, that is. In this session we reveal the changes and improvements to the built-in high availability platform in Exchange Server 2010. Exchange 2010 includes a unified solution for high availability and disaster recovery that is quick to deploy and easy to manage. Learn about all of the new features in Exchange 2010 that make it the most resilient, highly available version of Exchange ever.

    UNC307 - Microsoft Exchange Server 2010 High Availability: Presentation Transcript

    • High Availability
      Scott Schnoll
      Principal Technical Writer
      Microsoft Corporation
      Session Code: UNC307
    • Agenda
      Exchange 2010 High Availability Vision/Goals
      Exchange 2010 High Availability Features
      Exchange 2010 High Availability Deep Dive
      Deploying Exchange 2010 High Availability Features
      Transitioning to Exchange 2010 High Availability
      High Availability Design Examples
    • Exchange 2010 High Availability Vision/Goals
    • Exchange 2010 High Availability Vision and Goals
      Vision: Deliver a fast, easy-to-deploy and operate, economical solution that can provide messaging service continuity for all customers
      Goals
      Deliver a native solution for high availability/site resilience
      Enable less expensive and less complex storage
      Simplify administration and reduce support costs
      Increase end-to-end availability
      Support Exchange Server 2010 Online
      Support large mailboxes at low cost
    • Complex site resilience and recovery (Exchange Server 2003)
      Clustered Mailbox Server had to be created manually
      Third-party data replication needed for site resilience
      Clustering knowledge required
      Failover at Mailbox server level
      [Diagram labels: Dallas; San Jose; Outlook; OWA, ActiveSync, or Outlook Anywhere; Front End Server; NodeA (active); NodeB (passive); Standby Cluster; DB1 through DB6]
    • Complex activation for remote server / datacenter (Exchange Server 2007)
      Clustered Mailbox Server can’t co-exist with other roles
      No GUI to manage SCR
      Clustering knowledge required
      Failover at Mailbox server level
      [Diagram labels: Dallas; San Jose; Outlook; OWA, ActiveSync, or Outlook Anywhere; Client Access Server; CCR; NodeA (active); NodeB (passive); SCR; Standby Cluster; DB1 through DB6]
    • Exchange Server 2010
      All clients connect via CAS servers
      Easy to extend across sites
      Failover managed by/with Exchange
      Database level failover
      [Diagram labels: Dallas; San Jose; Client; Client Access Server; Mailbox Server 1 through Mailbox Server 6; copies of DB1 through DB5 spread across the Mailbox servers]
    • Exchange 2010 High Availability Features
    • Exchange 2010 High Availability Terminology
      High Availability – Solution must provide data availability, service availability, and automatic recovery from failures
      Disaster Recovery – Process used to manually recover from a failure
      Site Resilience – Disaster recovery solution used for recovery from site failure
      *over – Short for switchover/failover; a switchover is a manual activation of one or more databases; a failover is an automatic activation of one or more databases after a failure
    • Exchange 2010 High Availability Feature Names
      Mailbox Resiliency – Name of Unified High Availability and Site Resilience Solution
      Database Mobility – The ability of a single mailbox database to be replicated to and mounted on other mailbox servers
      Incremental Deployment – The ability to deploy high availability/site resilience after Exchange is installed
      Exchange Third Party Replication API – An Exchange-provided API that enables use of third-party replication for a DAG in lieu of continuous replication
    • Exchange 2010 High Availability Feature Names
      Database Availability Group – A group of up to 16 Mailbox servers that host a set of replicated databases
      Mailbox Database Copy – A mailbox database (.edb file and logs) that is either active or passive
      RPC Client Access service – A Client Access server feature that provides a MAPI endpoint for Outlook clients
      Shadow Redundancy – A transport feature that provides redundancy for messages for the entire time they are in transit
    • Exchange 2010 *overs
      Within a datacenter
      Database or server *overs
      Datacenter level: switchover
      Between datacenters
      Database or server *overs
      Assumptions:
      Each datacenter is a separate Active Directory site
      Each datacenter has live, active messaging services
      Standby datacenter must be active to support single database *over
    • Exchange 2007 Concepts Brought Forward
      Extensible Storage Engine (ESE)
      Databases and log files
      Continuous Replication
      Log shipping and replay
      Database seeding
      Store service/Replication service
      Database health and status monitoring
      Divergence
      Automatic database mount behavior
      Concepts of quorum and witness
      Concepts of *overs
    • Exchange 2010 Cut Concepts
      Storage Groups
      Databases identified by the server on which they live
      Server names as part of database names
      Clustered Mailbox Servers
      Pre-installing a Windows Failover Cluster
      Running Setup in Clustered Mode
      Moving a CMS network identity between servers
      Shared Storage
      Two HA Copy Limits
      Requirement of Two Networks
      Concepts of public, private and mixed networks
    • HA/Backup Strategy Changes
      Scenario | Exchange 2010 Feature Set | Feature Benefits
      Fast Recovery: HW/SW Failures, Data Center Failures | Mailbox Resiliency | Fast recovery; data redundancy
      Data Retention: Accidentally Deleted Items | Single Item Recovery | Guaranteed item retention
      Data Retention: Administrator Error, Mailbox Corruption | Lagged Copy | Past point-in-time DB copy
      Long Term Data Retention | Personal Archive + Retention Policies | Alternate mailbox for older data
    • Exchange 2010 High Availability Deep Dive
    • Exchange 2010 HA Fundamentals
      Database Availability Group
      Server
      Database
      Database Copy
      Active Manager
      RPC Client Access
      [Diagram labels: DAG; RPC CAS (x2); SVR (x2); AM (x2); DB (x2); copy (x4)]
    • Database Availability Group (DAG)
      Base component of high availability and site resilience
      A group of up to 16 servers that host a set of replicated databases
      “Wraps” a Windows Failover Cluster
      Manages membership (DAG member = node)
      Provides heartbeat of DAG member servers
      Active Manager stores data in cluster database
      Defines a boundary for:
      Mailbox database replication
      Database and server *overs
      Active Manager
    • DAG Requirements
      Windows Server 2008 SP2 Enterprise Edition or Windows Server 2008 R2 Enterprise Edition
      Exchange Server 2010 Standard Edition or Exchange Server 2010 Enterprise Edition
      Standard supports up to 5 databases per server
      Enterprise supports up to 100 databases per server
      At least one network card per DAG member
    • Active Manager
      Exchange component that manages *overs
      Runs on every server in the DAG
      Selects best available copy on failovers
      Is the definitive source of information on where a database is active
      Stores this information in cluster database
      Provides this information to other Exchange components (e.g., RPC Client Access and Hub Transport)
      Two Active Manager roles: PAM and SAM
      Active Manager client runs on CAS and Hub
    • Active Manager
      Primary Active Manager (PAM)
      Runs on the node that owns the cluster group
      Gets topology change notifications
      Reacts to server failures
      Selects the best database copy on *overs
      Standby Active Manager (SAM)
      Runs on every other node in the DAG
      Responds to queries about which server hosts the active copy of the mailbox database
      Both roles are necessary for automatic recovery
      If Replication service is stopped, automatic recovery will not happen
    • Active Manager: Selection of Active Database Copy
      Active Manager selects the “best” copy to become active when the existing active copy fails
      Ignores servers that are unreachable or on which activation is temporarily or regularly blocked
      Sorts copies by currency to minimize data loss
      Breaks ties during sort based on Activation Preference
      Selects from the sorted list based on the copy status of each copy
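
The filtering and sorting steps on this slide can be sketched in a few lines of Python. This is only an illustration, not Exchange code: the `Copy` class, its field names, and the use of copy queue length as the measure of "currency" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """Hypothetical view of one passive database copy (illustrative fields)."""
    server: str
    reachable: bool
    activation_blocked: bool
    copy_queue_length: int      # assumed currency measure: fewer logs behind = more current
    activation_preference: int  # lower number = preferred


def rank_candidates(copies):
    """Drop unreachable or blocked servers, then sort by currency,
    breaking ties on activation preference."""
    eligible = [c for c in copies if c.reachable and not c.activation_blocked]
    return sorted(eligible,
                  key=lambda c: (c.copy_queue_length, c.activation_preference))


if __name__ == "__main__":
    candidates = [
        Copy("EXMBX2", True, False, 4, 3),
        Copy("EXMBX3", True, False, 0, 2),
        Copy("EXMBX4", False, False, 0, 1),  # unreachable, ignored
        Copy("EXMBX5", True, True, 0, 1),    # activation blocked, ignored
        Copy("EXMBX6", True, False, 0, 4),
    ]
    for c in rank_candidates(candidates):
        print(c.server, c.copy_queue_length, c.activation_preference)
```

The sorted list is only the input to the final step; which copy is actually chosen still depends on the status checks applied to each entry.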
    • Active Manager: Selection of Active Database Copy
      Active Manager selects the “best” copy to become active when the existing active copy fails
      Every criteria set requires a copy status of Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource; the sets are tried in order:
      1. Catalog Healthy, CopyQueueLength < 10, ReplayQueueLength < 50
      2. Catalog Crawling, CopyQueueLength < 10, ReplayQueueLength < 50
      3. Catalog Healthy, ReplayQueueLength < 50
      4. Catalog Crawling, ReplayQueueLength < 50
      5. ReplayQueueLength < 50
      6. Catalog Healthy, CopyQueueLength < 10
      7. Catalog Crawling, CopyQueueLength < 10
      8. CopyQueueLength < 10
      9. Catalog Healthy
      10. Catalog Crawling
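
The criteria sets on this slide can be read as a series of progressively looser filters applied to the sorted list of copies. The Python sketch below illustrates that idea only. The `CopyState` class and function names are hypothetical, and the order of the ten sets is an assumption based on the slide's numbering (strictest first), which the transcript preserves only partially.

```python
from dataclasses import dataclass

ELIGIBLE_STATUSES = {
    "Healthy",
    "DisconnectedAndHealthy",
    "DisconnectedAndResynchronizing",
    "SeedingSource",
}


@dataclass
class CopyState:
    """Hypothetical snapshot of a database copy (illustrative fields)."""
    server: str
    status: str
    catalog: str              # e.g. "Healthy" or "Crawling"
    copy_queue_length: int
    replay_queue_length: int


# (required catalog state or None, require CopyQueueLength < 10,
#  require ReplayQueueLength < 50); assumed order, strictest first.
CRITERIA_SETS = [
    ("Healthy", True, True),     # 1
    ("Crawling", True, True),    # 2
    ("Healthy", False, True),    # 3
    ("Crawling", False, True),   # 4
    (None, False, True),         # 5
    ("Healthy", True, False),    # 6
    ("Crawling", True, False),   # 7
    (None, True, False),         # 8
    ("Healthy", False, False),   # 9
    ("Crawling", False, False),  # 10
]


def meets(copy, catalog, need_cql, need_rql):
    """Return True if the copy satisfies one criteria set."""
    if copy.status not in ELIGIBLE_STATUSES:
        return False
    if catalog is not None and copy.catalog != catalog:
        return False
    if need_cql and not copy.copy_queue_length < 10:
        return False
    if need_rql and not copy.replay_queue_length < 50:
        return False
    return True


def select_best_copy(sorted_copies):
    """Return (criteria set number, copy) for the first match, or None."""
    for number, criteria in enumerate(CRITERIA_SETS, start=1):
        for copy in sorted_copies:
            if meets(copy, *criteria):
                return number, copy
    return None
```

Walking the sets from strictest to loosest means a fully caught-up copy with a healthy catalog always wins when one exists, and a lagging copy is chosen only when nothing better is available.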
    • Automatic Recovery Process
      When a failure occurs that affects a database:
      Active Manager determines the best copy to activate
      The Replication service on the target server attempts to copy missing log files from the source (ACLL)
      If successful, then the database will mount with zero data loss
      If unsuccessful (lossy failure), then the database will mount based on the AutoDatabaseMountDial setting
      The mounted database will generate new log files (using the same log generation sequence)
      Transport Dumpster requests will be initiated for the mounted database to recover lost messages
      When original server or database recovers, it will run through divergence detection and either perform an incremental resync or require a full reseed
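
The mount decision after attempted copy last logs (ACLL) can be illustrated with a small Python sketch. The function name and return strings are hypothetical, and the log-count thresholds for each AutoDatabaseMountDial value (0, 6, and 12) are quoted from memory of the Exchange 2010 documentation, so verify them against the product documentation before relying on them.

```python
# Assumed thresholds: maximum number of missing log files tolerated
# for an automatic mount under each AutoDatabaseMountDial value.
AUTO_DATABASE_MOUNT_DIAL = {
    "Lossless": 0,
    "GoodAvailability": 6,
    "BestAvailability": 12,
}


def mount_decision(missing_logs_after_acll, dial="GoodAvailability"):
    """Decide what happens once ACLL has finished (simplified)."""
    if missing_logs_after_acll == 0:
        # ACLL copied everything: mount with zero data loss.
        return "mount (zero data loss)"
    if missing_logs_after_acll <= AUTO_DATABASE_MOUNT_DIAL[dial]:
        # Lossy failure within the configured tolerance: mount and let
        # transport dumpster requests recover lost messages.
        return "mount (lossy, transport dumpster requested)"
    return "do not mount automatically"


if __name__ == "__main__":
    for missing in (0, 4, 9, 20):
        print(missing, "->", mount_decision(missing, "GoodAvailability"))
```

This leaves out details such as the extra waiting behaviour associated with the Lossless setting; it only shows how the dial bounds the amount of loss accepted for an automatic mount.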
    • Example: Database Failover
      Database failure occurs
      Failure item is raised
      Active Manager moves active database
      Database copy is restored
      Similar flow within and across datacenters
      [Diagram: a DAG of Mailbox Server 1 through Mailbox Server 5 hosting three copies each of DB1 through DB5]
    • Example: Server Failover
      Server failure occurs
      Cluster notification of node down
      Active Manager moves active databases
      Server is restored
      Cluster notification of node up
      Database copies resynchronize with active databases
      Similar flow within and across datacenters
      [Diagram: a DAG of Mailbox Server 1 through Mailbox Server 5 hosting three copies each of DB1 through DB5]
    • Example: RCA service and AM
      Outlook tries to reconnect
      Outlook tries again
      Active Manager returns Mailbox Server1
      Outlook’s reconnect triggers new AM request
      If failover is in progress, AM returns old server and connect fails
      DB failover is complete and AM returns new server
      Disk fails
      CAS fails
      Where’s the DB mounted?
      [Diagram: Outlook1, Outlook2, and Outlook3 connect through a Load Balancer to a CAS Array (CAS1, CAS2, CAS3), each labeled RPC Client Access Server with an Active Manager Client; the DAG contains Mailbox Server1 through Mailbox Server4, each labeled MAPI RPC, Active Manager, and Store]
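
The reconnect behaviour described on this slide amounts to a retry loop in which every attempt asks Active Manager again where the database is mounted. The Python sketch below models that loop with two hypothetical callables, `lookup_active_server` and `try_connect`; neither corresponds to a real Exchange API.

```python
def connect_with_retry(lookup_active_server, try_connect, max_attempts=3):
    """Ask for the active server before each attempt and retry on failure.

    Returns (server, attempts_used) on success or (None, max_attempts).
    """
    for attempt in range(1, max_attempts + 1):
        server = lookup_active_server()   # new Active Manager request per attempt
        if try_connect(server):
            return server, attempt
    return None, max_attempts


if __name__ == "__main__":
    # Simulated failover: the first two lookups still return the old server.
    answers = iter(["Server1", "Server1", "Server2"])
    mounted_on = "Server2"
    result = connect_with_retry(lambda: next(answers),
                                lambda server: server == mounted_on)
    print(result)
```

While the failover is in progress the lookup keeps returning the old server and the connection fails; once the database is mounted elsewhere, the next lookup returns the new server and the client connects.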
    • DAG Lifecycle
      DAG is created initially as empty object in Active Directory
      Continuous replication or 3rd party replication using Third Party Replication mode
      DAG is given a name and one or more IP addresses (or configured to use DHCP)
      When first Mailbox server is added to a DAG
      A Windows failover cluster is formed with a Node Majority quorum using the name of the DAG
      The server is added to the DAG object in Active Directory
      A cluster name object (CNO) for the DAG is created in the built-in Computers container
      The Name and IP address of the DAG is registered in DNS
      The cluster database for the DAG is updated with info on configured databases, including if they are locally active (which they should be)
    • DAG Lifecycle
      When second and subsequent Mailbox server is added to a DAG
      The server is joined to cluster for the DAG
      The quorum model is automatically adjusted
      Node Majority - DAGs with odd number of members
      Node and File Share Majority - DAGs with even number of members
      File share witness cluster resource, directory, and share are automatically created by Exchange when needed
      The server is added to the DAG object in Active Directory
      The cluster database for the DAG is updated with info on configured databases, including if they are locally active (which they should be)
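
The automatic quorum adjustment described above follows a simple odd/even rule, sketched here in Python. The function name is hypothetical; Exchange applies this rule internally when members are added or removed.

```python
def quorum_model(member_count):
    """Return the quorum model a DAG uses for a given number of members."""
    if member_count < 1:
        raise ValueError("a DAG cluster needs at least one member")
    if member_count % 2 == 1:
        return "Node Majority"
    return "Node and File Share Majority"


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, quorum_model(n))
```

The first server added therefore forms a Node Majority cluster, and adding a second switches the cluster to Node and File Share Majority, which is when the file share witness is needed.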
    • DAG Lifecycle
      After servers have been added to a DAG
      Configure the DAG
      Network Encryption
      Network Compression
      Configure DAG networks
      Network subnets
      Enable/disable MAPI traffic/replication
      Create mailbox database copies
      Seeding is performed automatically
      Monitor health and status of database copies
      Perform switchovers as needed
    • DAG Lifecycle
      Before you can remove a server from a DAG, you must first remove all replicated databases from the server
      When a server is removed from a DAG:
      The server is evicted from the cluster
      The cluster quorum is adjusted as needed
      The server is removed from the DAG object in Active Directory
      Before you can remove a DAG, you must first remove all servers from the DAG
    • Deploying Exchange 2010 HA Features
    • Exchange 2010 Incremental Deployment
      Create a DAG
      New-DatabaseAvailabilityGroup -Name DAG1 -WitnessServer EXHUB1 -WitnessDirectory C:\DAG1FSW -DatabaseAvailabilityGroupIpAddresses 10.0.0.8
      New-DatabaseAvailabilityGroup -Name DAG2 -DatabaseAvailabilityGroupIpAddresses 10.0.0.8,192.168.0.8
      Add first Mailbox Server to DAG
      Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX1
      Add second and subsequent Mailbox Server
      Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer EXMBX2
      Add a Mailbox Database Copy
      Add-MailboxDatabaseCopy -Identity MBXDB1 -MailboxServer EXMBX3
      Extend as needed
    • Transitioning to Exchange 2010 High Availability
    • Transition Steps
      Verify that you meet requirements for Exchange 2010
      Deploy Exchange 2010
      Use Exchange 2010 mailbox move features to migrate
      Unsupported Transitions
      In-place upgrade to Exchange 2010 from any previous version of Exchange
      Using database portability between Exchange 2010 and non-Exchange 2010 databases
      Backup and restore of earlier versions of Exchange databases on Exchange 2010
      Using continuous replication between Exchange 2010 and Exchange 2007
    • Exchange Server 2010 High Availability Design Examples
    • High Availability Design Example: Branch/Small Office Design
      8 processor cores recommended with a maximum of 64GB RAM
      Member servers of DAG can host other server roles
      UM role not recommended for co-location
      2-server DAGs should use RAID
      [Diagram: Hardware Load Balancer in front of two servers, each labeled Client Access, Hub Transport, Mailbox, hosting copies of DB1, DB2, and DB3]
    • High Availability Design Example: Double Resilience – Maintenance + DB Failure
      2 servers out -> manual activation of server 3
      In 3 server DAG, quorum is lost
      DAGs with more servers sustain more failures – greater resiliency
      AD: Dublin
      Single Site
      3 Nodes
      3 HA Copies
      JBOD -> 3 physical Copies
      [Diagram: CAS NLB Farm in front of a Database Availability Group of Mailbox Server 1 through Mailbox Server 3, hosting three copies each of DB1 through DB6; two elements are marked X]
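
The loss of quorum in a three-server DAG with two servers out follows from majority arithmetic, which the Python sketch below works through. The function names are hypothetical, and the model is simplified: it assumes that a DAG with an even number of members has a file share witness contributing one vote, and that a DAG with an odd number of members has none.

```python
def votes_needed(total_votes):
    """A strict majority of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(members, members_up, witness_available=True):
    """Return True if the surviving voters still form a majority."""
    uses_witness = members % 2 == 0          # even member count: witness votes
    total = members + (1 if uses_witness else 0)
    up = members_up + (1 if uses_witness and witness_available else 0)
    return up >= votes_needed(total)


if __name__ == "__main__":
    print("3 members, 1 up:", has_quorum(3, 1))
    print("3 members, 2 up:", has_quorum(3, 2))
    print("4 members, 2 up, witness up:", has_quorum(4, 2))
```

With three members, a single survivor holds one of three votes and cannot form a majority, which is why the remaining server needs manual activation. With four members and a reachable witness, two survivors plus the witness hold three of five votes.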
    • High Availability Design Example: Double Node/Disk Failure Resilience
      AD: Dublin
      • Single Site
      • 4 Nodes
      • 3 HA Copies
      • JBOD -> 3 physical Copies
      • Upgrade server 1
      • Server 2 fails
      • Server 1 upgrade is done
      • 2 active copies die
      [Diagram: CAS NLB Farm in front of a Database Availability Group (DAG) of Mailbox Server 1 through Mailbox Server 4, hosting three copies each of DB1 through DB8; two elements are marked X]
    • High Availability on JBOD: 6 Servers, 3 Racks, 3 Copy DAG
      24,000 Mailboxes
      Heavy Profile: 100 Messages/day, .1 IOPS/Mailbox
      2GB Mailbox Size
      Servers: 8 Cores, 48 GB RAM
      MAPI network and Replication network
      4,000 Active Mbxs/Svr
      6 Servers, 3 Copies = double server failure resiliency
      1st failure: ~5,000 active
      2nd failure: 6,000 active
      Soft active limit: 24
      1TB 7.2k SATA disks
      JBOD: 48 Disks/node
      Online Spares (3)
      288 disks total
      30 TB of db space
      Battery Backed Caching Array Controller
      [Diagram: Database Availability Group (DAG) with Mbx Server 1 and Mbx Server 2 labeled and database copies DB1 through DB90; legend: Active copy, Passive copy, Spare Disk]
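
The figures on this slide can be cross-checked with a little arithmetic, sketched below in Python. Several inputs are assumptions rather than statements from the slide: the count of 90 databases is inferred from the diagram labels DB1 through DB90, active load is assumed to spread evenly over surviving servers, and each database copy is assumed to occupy one disk.

```python
MAILBOXES = 24_000
SERVERS = 6
DATABASES = 90          # assumption: inferred from labels DB1..DB90
COPIES_PER_DB = 3
DISKS_PER_NODE = 48
ONLINE_SPARES = 3
SOFT_ACTIVE_LIMIT = 24  # "Soft active limit: 24" from the slide


def active_mailboxes_per_server(failed_servers):
    """Active mailboxes per surviving server, assuming an even spread."""
    return MAILBOXES // (SERVERS - failed_servers)


def active_databases_per_server(failed_servers):
    """Active databases per surviving server, rounded up."""
    survivors = SERVERS - failed_servers
    return -(-DATABASES // survivors)   # ceiling division on integers


def copies_per_server():
    """Database copies hosted on each server (active plus passive)."""
    return DATABASES * COPIES_PER_DB // SERVERS


if __name__ == "__main__":
    for failed in range(3):
        print(failed, "failed:",
              active_mailboxes_per_server(failed), "active mailboxes,",
              active_databases_per_server(failed), "active databases")
    print("disks per node:", copies_per_server() + ONLINE_SPARES)
    print("disks total:", SERVERS * DISKS_PER_NODE)
```

Under these assumptions the numbers line up with the slide: 4,000 active mailboxes per server normally, 4,800 (the slide's "~5,000") after one failure, and 6,000 after two, with the active database count per server staying under the soft limit of 24.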
    • Key Takeaways
      Greater end-to-end availability with Mailbox Resiliency
      Unified framework for high availability and site resilience
      Faster and easier to deploy with Incremental Deployment
      Reduced TCO with core ESE architecture changes and more storage options
      Supports large mailboxes for less money
    • question & answer
    • Required Slide: Resources
      Speakers, TechEd 2009 is not producing a DVD. Please announce that attendees can access session recordings at TechEd Online.
      www.microsoft.com/teched: Sessions On-Demand & Community
      www.microsoft.com/learning: Microsoft Certification & Training Resources
      http://microsoft.com/technet: Resources for IT Professionals
      http://microsoft.com/msdn: Resources for Developers
    • Complete an evaluation on CommNet and enter to win an Xbox 360 Elite!
    • Required Slide
      © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
      The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.