High availability Lync Server 2010
 


Presentation on High Availability in Lync Server 2010


  • Slide Objective: In the following section, briefly explain what high availability means to us and what the major challenges were in previous versions of OCS.
  • Slide Objective: Explain in a few words the HA architecture of the legacy systems that preceded W14.
    Notes: For a clear understanding of the new system, highlight that in OCS 2007 and OCS 2007 R2, Bob’s OC and his phone can be registered on different Front End Servers. In CS “14”, this behavior is no longer possible. Also highlight that a granular separation of the services (registration, routing, presence, and conferencing) is not possible with OCS 2007 and OCS 2007 R2, and that a pool with multiple Front End Servers always requires a hardware load balancer (HLB). Remind students that with OCS 2007 R2 only one instance of SQL Server was used within one pool.
  • Slide Objective: Explain the CS “14” HA process.
    Notes: Start by explaining that in CS “14” every user logs on to a predefined Front End Server within one pool. Briefly describe the hash algorithm that generates a logon sequence per SIP URI. Highlight clearly that Bob is connected to ONE Front End Server, regardless of how many SIP endpoints he uses in parallel. Looking for details? See nexthop.info (Introducing DNS Load Balancing in CS 14).
    Explain that every server within CS “14” has its own SQL Express database, which is used for several purposes; in this example, for registration and routing.
    HLB is optional and is NOT recommended for SIP traffic. DNS LB is recommended for SIP, which makes the HLB configuration easier (web traffic only). To reduce failed-call incidents caused by HLB misconfiguration, it is desirable to get the HLB out of the main SIP routing path. HLB is advised for SIP traffic only when a customer plans to run in coexistence (OCS 2007 and CS “14”, or OCS 2007 R2 and CS “14”) for a long period (for example, six months or more).
  • Slide Objective: Discuss the architecture within CS “14”.
    Notes: Start by explaining that within CS “14” every SIP account has a primary and a secondary (backup) registrar pool. With this feature a SIP account can log on to various pools; the example above shows that a user (Bob) can also register with the SBA in the branch office. Make sure that students understand that presence and conferencing information from pool 1 (data center 1) is not transferred to pool 2 (data center 2). If pool 1 fails, users can log on to the backup pool (pool 2), but not all features are available while they are signed in to their backup pool. A diagram with details is shown later in this presentation.
    What applies to branch office users? A branch user’s Primary Registrar Pool = Survivable Branch Appliance (SBA); Backup Registrar Pool = data center CS pool. Branch users always register with the SBA Registrar (primary) unless it is unavailable.
  • Slide Objective: Discuss failover.
    Notes: In this case, CS is NOT split across two data centers; instead, there are two separate CS installations. Users can fail over from one to the other, but any functionality that relies on information stored in the CS back end is lost at failover. This covers the vast majority of the “unavailable features”: user call forwarding settings and similar data can only be restored by manually copying them from a backup into the back end of the second pool. Voicemail issues here are the same as in the previous slide.
    Two modes for failover and failback:
    • Automatic: backup connection after a configurable interval
    • Manual: an administrator switch enables the connection (the manual capability avoids failover on transient failures)
    What happens if the primary data center cannot be restored?
    • Restore the Central Management Server in the backup data center
    • Restore other services, including Presence and Conferencing, by “moving” users to the other pool
  • Slide Objective: Discuss failover.
    Notes:
    1. As in OCS 2007 and OCS 2007 R2, the client queries DNS SRV to obtain a CS “14” pool FQDN (in this case a Director server). The DNS server then returns a Director pool FQDN (it can also return multiple addresses for DNS load balancing purposes).
    2. A SIP REGISTER request is sent over TLS to the Director server. The server returns a 401 certificate challenge. (Ensure that the audience understands that this certificate request is different from an AD certificate authority.) CS “14” provides its own CA, which is used only for authentication purposes.
    3. The client connects to the CS “14” certificate service with its Windows credentials.
    4. The pool creates a certificate and returns it to the client as well as the server. (It is more like a token that CS “14” issues to the client.) It can only be used within CS.
    5. With the issued certificate, the client tries again to register with the pool. In this case the Director returns a 301 redirect message to redirect the client to the pool.
    6. If the primary pool becomes unavailable, the client automatically connects to the backup registrar.
  • Slide Objective: Explain how CS “14” works when split over two data centers.
    Notes: In this case, CS is effectively split across two data centers. If one data center is lost, the user fails over to the second and still has almost all features, much like the case where a single server is lost in an R2 CS pool.
    The ability to leave voicemail for the user is lost if the user’s DID number terminates in the failed data center and the associated gateways or SIP trunk connections are lost. This is expected to be a likely case.
    The Response Group service is an application running on the local pool.
    Further talking points: If you plan to deploy this CS architecture, you need a geo cluster for SQL Server (this is necessary for the functionality of SQL Server, not of the CS server). The entire pool works as a logical unit.
  • Slide Objective: Explain how CS “14” works when split over two data centers.
    Notes: Once again, a more detailed flow:
    • The CS client queries a local DNS server (if necessary) to get the pool FQDN (remember that this can also happen via the client cache or DHCP option 120).
    • Afterwards, the client connects to its primary Front End Server.
    In the event of a failover in the NY data center:
    • The SQL Server cluster initiates a failover (the SQL Server in the backup data center becomes the active SQL Server).
    • The client automatically tries to connect to the next server in the generated logon list for the specific SIP URI. This continues until the client is able to log on to the next available server.
    What happens to the client: the client should sign out and sign in again within a short time (media conversations should not break).
    Why are two options marked as not working in the deck? The ability to leave voicemail for the user is lost if the user’s DID number terminates in the failed data center and the associated gateways or SIP trunk connections are lost. This is expected to be a likely case.
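
The logon-list walk described in the last note can be sketched as follows. This is an illustration only: the server names, the logon order, the assumption that FE 1-2 sit in the failed site, and the helper `first_available` are all hypothetical, and the deck does not describe the real client logic.

```python
from typing import Optional, Sequence, Set


def first_available(logon_list: Sequence[str], down: Set[str]) -> Optional[str]:
    """Return the first server in the user's logon list that is not down."""
    for server in logon_list:
        if server not in down:
            return server
    return None  # no Front End Server is reachable


# Hypothetical logon order for one SIP URI across four Front End Servers.
bob_logon_list = ["fe2", "fe4", "fe1", "fe3"]

# Assumption for illustration: FE 1-2 are in the site that fails.
failed_site = {"fe1", "fe2"}

before = first_available(bob_logon_list, down=set())
after = first_available(bob_logon_list, down=failed_site)
print(before, "->", after)
```

Because the list is fixed per SIP URI, every endpoint of the same user would walk the same order and end up on the same surviving server.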

Presentation Transcript

  • High Availability
    1
  • High Availability in OCS 2007 / 2007 R2
    Office Communications Server (OCS) 2007 and R2
    Diagram labels: Registration, Routing, Presence, Conferencing; HLB required for all traffic; Bob’s OC; Bob’s Phone
    Architecture:
    • One monolithic Front End Service
    • Dependency on single shared backend database (Registration, Routing, Presence, Conferencing)
    2
  • High Availability – Communications Server “14”
    Microsoft Communications Server “14”
    Diagram labels: User Services Database (Presence and Conferencing); Registrar Database (Registration and Routing); HLB is optional for SIP traffic (DNS LB is recommended); HLB still required for client-server HTTP traffic; Bob’s OC; Bob’s Phone
    Architecture:
    • Registrar Role (Registration and Routing). Each Registrar has its own SQL Express database
    • User Services Role (Presence and Conferencing)
    • Registrar and User Services are collocated in the data center (but on different servers)
    • All of a user’s endpoints register with the same Front End
    • Users are load balanced across Registrars using a distributed hash algorithm
    • A Registrar can be installed in remote locations
    3
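
The deck does not specify the distributed hash algorithm, so the following is only a minimal sketch of the idea, using rendezvous (highest-random-weight) hashing as a stand-in. It shows how a SIP URI can yield a deterministic ordering of Front End Servers, so that all of a user's endpoints land on the same server. The function name `logon_sequence` and the host names are hypothetical.

```python
import hashlib
from typing import List


def logon_sequence(sip_uri: str, front_ends: List[str]) -> List[str]:
    """Return a deterministic, per-user ordering of Front End Servers.

    Stand-in technique: rendezvous hashing. Each (user, server) pair gets a
    score, and servers are ordered by descending score.
    """
    def score(server: str) -> int:
        key = f"{sip_uri.lower()}|{server}".encode("utf-8")
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

    return sorted(front_ends, key=score, reverse=True)


# Hypothetical pool members.
pool = ["fe1.contoso.com", "fe2.contoso.com", "fe3.contoso.com", "fe4.contoso.com"]

sequence = logon_sequence("sip:bob@contoso.com", pool)
print("Bob's logon sequence:", sequence)
```

Every endpoint that computes the sequence for the same SIP URI gets the same first choice, and the remaining entries give the order in which to try other servers if that one is down.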
  • Resiliency Architecture
    Diagram labels:
    • Branch Office: SBA Registrar. Joe’s Primary Registrar = SBA, User Services = EE Pool 1
    • Data Center - EE Pool 1: Registrar (Registration & Routing), Presence, Conferencing, Backup Registrar Pool, AD & DNS. Bob’s Primary Registrar & User Services = EE Pool 1
    • Data Center - EE Pool 2: Registrar (Registration & Routing), Presence, Conferencing, Backup Registrar Pool, AD & DNS. Alice’s Primary Registrar & User Services = EE Pool 2
    Architecture:
    • Each user has a “Primary Registrar Pool”. Each Registrar Pool can have a “Backup Registrar Pool”
    • The user’s client discovers a Registrar Pool through DNS SRV and is directed to its “Primary & Backup Registrar Pool”
    • The Backup Registrar heartbeats the Primary Registrar. If no heartbeat is received within the configurable failover interval (default = 120 sec for branch offices), the Backup starts accepting client registrations
    4
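
The heartbeat rule on this slide can be sketched as a small timer check. This is a minimal illustration, not the product's implementation: the class and method names are hypothetical, times are plain seconds supplied by the caller, and failback is not modeled. Only the 120-second default comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class BackupRegistrar:
    """Hypothetical model of a backup registrar watching its primary."""

    failover_interval_s: float = 120.0  # default quoted on the slide for branch offices
    last_heartbeat_s: float = 0.0
    accepting_registrations: bool = False

    def on_heartbeat(self, now_s: float) -> None:
        """Record the time of the latest heartbeat seen from the primary."""
        self.last_heartbeat_s = now_s

    def tick(self, now_s: float) -> bool:
        """Start accepting registrations once the interval passes with no heartbeat."""
        if now_s - self.last_heartbeat_s > self.failover_interval_s:
            self.accepting_registrations = True
        return self.accepting_registrations


backup = BackupRegistrar()
backup.on_heartbeat(now_s=10.0)
print(backup.tick(now_s=100.0))  # still within the failover interval
print(backup.tick(now_s=131.0))  # more than 120 s since the last heartbeat
```

Waiting for a full interval rather than reacting to a single missed heartbeat is what keeps short, transient outages from triggering a failover.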
  • Data Center Voice Resiliency
    5
  • Data Center Voice Resiliency (EE): Failover to Backup Data Center
    Diagram labels: North America Data Center; Europe Data Center; Backup Registrar; CS “14” Edge1; CS “14” Pool 1; CS “14” Edge2; CS “14” Pool 2; Failover; WAN
    • Client DNS SRV request discovers (one or multiple) Communications Server “14” Pool. That Communications Server “14” Pool directs the client to the primary and backup SIP registrar
    • Client attempts to connect to the Primary Registrar Pool; if that fails, it connects to the Backup
    • Limited feature set available on failover
    • Enable/disable automatic failover; configurable failover interval
    • Automatic failback; configurable failback interval (no manual failback; workaround: stop the Front End services on the Primary Registrar pool servers)
    • What happens if the Primary Data Center cannot be restored?
    6
  • Data Center Voice Resiliency (SE): Failover to Backup Data Center
    Diagram labels: North America Data Center; Europe Data Center; Backup Registrar; CS “14” Edge1; CS “14” SE 1; CS “14” Edge2; CS “14” SE 2; Failover; WAN
    • SE servers operate as separate systems
    • Client DNS SRV request discovers (one or multiple) Communications Server “14” SE servers. That Communications Server “14” SE server directs the client to the primary and backup SIP registrar
    • Client attempts to connect to the Primary Registrar; if that fails, it connects to the Backup
    • Limited feature set available on failover
    • Enable/disable automatic failover; configurable failover interval
    • Automatic failback; configurable failback interval (no manual failback; workaround: stop the Front End services on the Primary Registrar servers)
    • If the Primary Data Center cannot be restored:
      • Restore the Central Management Server in the backup data center
      • Restore other services, including Presence and Conferencing, by “moving” users to the other pool
    7
  • Data Center Voice Resiliency: Failover to Backup Data Center (Discovery)
    Diagram labels: North America Data Center; Europe Data Center; Backup Registrar; CS “14” Edge1; CS “14” Edge2; CS “14” Pool 1; CS “14” Pool 2; CS “14” Director Pool; AD DS & DNS; WAN; flow steps (1) to (6)
    (1) Client DNS SRV request. Example: DNS SRV for _sipinternaltls._tcp.contoso.com
    (2) DNS SRV response includes:
    • CSDirectorPool.contoso.com:5061 Priority=0, Weight=10
    • CSPool2.contoso.com:5061 Priority=1, Weight=10
    (3) Client connects via TLS to the Communications Server “14” Director Pool, sends SIP REGISTER, and authenticates.
    (4) The Communications Server “14” Director Pool redirects the client. The SIP 301 includes the Primary and Backup Registrar pools.
    (5) If the Primary Registrar Pool is available, the client connects and registers with it.
    (6) Otherwise, the client connects and registers with the Backup Registrar Pool (CS Pool 2).
    8
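
The discovery steps on this slide can be sketched in two parts: ordering the DNS SRV answers (a lower priority value is tried first, per the usual SRV convention) and then picking the primary or backup registrar from the redirect. This is a simplified illustration under stated assumptions: no real DNS or SIP traffic is involved, weights are ignored, the helper names are hypothetical, and `CSDirectorPool.contoso.com` and `CSPool1.contoso.com` are assumed host names (only `CSPool2.contoso.com` is spelled that way on the slide).

```python
from typing import Callable, List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    target: str
    port: int
    priority: int
    weight: int


def order_by_priority(records: List[SrvRecord]) -> List[SrvRecord]:
    """Lowest priority value first; weight-based selection is left out."""
    return sorted(records, key=lambda r: r.priority)


def choose_registrar(primary: str, backup: str,
                     is_available: Callable[[str], bool]) -> Optional[str]:
    """Prefer the primary registrar pool, fall back to the backup pool."""
    if is_available(primary):
        return primary
    if is_available(backup):
        return backup
    return None


srv_answer = [
    SrvRecord("CSPool2.contoso.com", 5061, priority=1, weight=10),
    SrvRecord("CSDirectorPool.contoso.com", 5061, priority=0, weight=10),
]
first_hop = order_by_priority(srv_answer)[0]

# Pretend the Director's redirect named these two pools and Pool 1 is down.
chosen = choose_registrar(
    primary="CSPool1.contoso.com",
    backup="CSPool2.contoso.com",
    is_available=lambda host: host != "CSPool1.contoso.com",
)
print(first_hop.target, "->", chosen)
```

The availability check is passed in as a function so the same selection logic can be exercised without any network access.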
  • Metropolitan Data Center Resiliency
    9
  • Metropolitan Data Center Resiliency: CS “14” Pool Extended Across Two Data Centers
    Diagram labels: NY Data Center; NJ Data Center; Passive SQL; Active SQL; CS “14” Edge (x2); FE 1-2; FE 3-4; Low-Latency WAN
    • Communications Server “14” pools operate as one logical system
    • Front End pool split across two data centers (all FEs active)
    • SQL geo cluster for the back end (stretched Virtual Local Area Network (VLAN))
    • Data replication is done by storage arrays (e.g. EMC SRDF, HP CLX EVA)
    • Requires low-latency WAN (15 milliseconds)
    • If one site is down, clients are serviced by FEs in the other site
    • Nearly all features available
    • PSTN termination may affect inbound calls
    • Failback has to be manually initiated
    10
  • Metropolitan Data Center Resiliency: CS “14” Pool Extended Across Two Data Centers
    Diagram labels: NY Data Center; NJ Data Center; Passive SQL; Active SQL; CS “14” Edge (x2); FE 1-2; FE 3-4; Low-Latency WAN; DNS SRV; DNS Server; Pool.contoso.com
    11