Lamar University CAS HA

Published in: Technology, Education
  • Consolidate identities on campus and standardize on a single LDAP service that provides that identity. Provide a Single Sign On (SSO) standard to the campus that is compatible with the vast majority of existing applications (Zimbra email, Self Service Banner, Blackboard LMS, payment gateway system, degree audit system, library systems, etc.). Introduce a self-service mechanism for account management and include it in the login flow. Provide capabilities for a lightweight portal that can be replaced with a full portal in the future.
  • The solution had to be compatible with the majority of applications, so solutions based on open standards were most favorable. The solution had to be highly available and resilient to both local disasters (datacenter outages, LAN and WAN outages, application service failures) and regional disasters (hurricanes). No single points of failure. Ability to do maintenance work on the system without an outage (rolling maintenance). This availability has to be transparent to the user to keep the SSO process seamless. The system had to be easily scalable with load (horizontally or vertically). Tie into the pre-existing AD, with attribute release to applications. Branding to match university standard requirements for regular browser access and mobile browser access. Active-active or active-failover delivery model with automated failover of services.
  • Had to use existing network resources as much as possible (existing LAN and WAN equipment and connections). Limited in-house developers and experience led to a very good partnership with Unicon to provide installation and ongoing support for the install base. Desire to keep as close to the release code as possible, without heavy customization. High-latency link between the datacenters. Short timeframe :)
  • SAN-based infrastructure with VM clusters. Replication of core/mission-critical data to the DR site, SAN to SAN, over a site-to-site VPN tunnel. Loadbalancer infrastructure (LTM) that supported SSL offload to a wildcard certificate. Dedicated datacenter firewalls at both the primary and DR datacenters. Opportunity!
  • VM clusters provided protection from hardware failures and the ability to scale horizontally via cloning. SAN storage provided storage redundancy and protection from hardware failures, with offline backup capability. SSL offloading to the loadbalancers allowed us to save money by using the wildcard certs. The choice of ticket registry was much debated; it came down to EHCache versus a database ticket registry. Chose to horizontally scale it to protect against service failures. To eliminate database failures taking down CAS, we chose an ndb cluster, which allows in-memory data with periodic writes to disk, built-in replication between nodes, and multiple access points to the same data, allowing the building of VM-like 'appliances' where each application instances
  • SSL offloading from Tomcat requires a special setup in Apache + AJP; it is not a true solution but a workaround. <insert configuration here> To replicate tickets between sites or not? Maintain persistence within the site (options: source address vs. jsession ID). Maintain persistence across sites? Authentication and validation can happen at different sites; enter site identifiers. Routing traffic based on site identifiers. Site identifiers are all over the place: for /serviceValidate it is in the URI, for /samlValidate it is in the SAML payload. The Java CAS client is unique: its unique ID is not generated using the same algorithms. So we had to add and remove site identifiers on the loadbalancers.
  • Lamar University CAS HA

    1. High Availability in Hurricane Alley
       Multi-site multi-node CAS Deep in the Heart of Texas
       Srinivas Varadaraj & Bill Thompson
       Jasig Sakai Conference
    2. Agenda
       1. Strategy
       2. Technical requirements
       3. Constraints
       4. Stuff at hand
       5. Architectural decisions
       6. Cluster & production architecture
       7. Challenges and solutions
       8. Multi-site routing
       9. Production experiences
       10. Questions & Comments
    3. Strategic requirements
       • Single Identity
       • Single Sign On / Single Sign Off
       • Maximize self-service tools
       • Improved user experience
    4. Technical requirements
       • Application compatibility
       • High availability
       • Rolling maintenance
       • Transparency
       • Scalability
       • AD integration
       • Customization (branding)
    5. Constraints
       • Limited budget, use existing resources
         – Power in the datacenters
         – Single internet
         – High-latency connectivity
       • Limited in-house development & experience
         – Stay close to release code
       • Aggressive timeframe
    6. Stuff we had at hand
       • SAN infrastructure with replication to DR
       • VM clusters
       • Site-to-site VPN based connectivity to DR
       • F5 loadbalancers
       • Dedicated firewalls
       • Opportunity
    7. Decisions! Decisions! Decisions!
       • Virtual machines
       • SAN-based storage
       • The great ticket registry debate
       • To replicate tickets or NOT!
       • Building by cloning
       • “Appliance”-like
       • SSL local vs. offloading
       • Cluster vs. standalone application servers
       • Timeout!
    8. Cluster components
    9. Final Architecture
    10. “Holy troubles, Batman!”
        • SSL offloading
          – Tomcat offloading workaround
        • Authentication and validation persistence
          – User and application can go to either site.
          – Enter site identifiers
        • Multi-site ticket replication
          – Latency in WAN
        • Algorithm usage in phpCAS clients and Java CAS clients
        • Slow performance of mod_auth_cas on VMs
    11. Routing logic
        • HTTP_REQUEST
        • HTTP_REQUEST_DATA
        • HTTP_RESPONSE
    12. HTTP_REQUEST (request from the client)
        HTTP_REQUEST {
          1) Grab header length to determine payload size
          2) If both sites are down, redirect to a branded service-unavailable page
          3) If URI has siteID of other site and other site is up, route to other site
          4) Otherwise default route to local site
        }
    13. HTTP_REQUEST_DATA (payload manipulation)
        HTTP_REQUEST_DATA {
          1) Grab <samlp:AssertionArtifact> from payload; this may contain siteID
          2) If we have a siteID of the other side {
               If the siteID is loadbalancer-introduced { blank the loadbalancer extension }
               Route to other side
             } else {
               If we have a siteID of the local side {
                 If the siteID is loadbalancer-introduced { blank the loadbalancer extension }
                 Route to local side
               }
             }
        }
    14. HTTP_RESPONSE (response from the server)
        HTTP_RESPONSE {
          1) Grab server's response headers
          2) If siteID is not in the response header {
               Introduce a loadbalancer siteID to compensate for the Java CAS client
             }
          Release HTTP to client
        }
    16. Experiences in Production
        • Approx. 8 months in production
        • 7 applications in production, 10 in development
        • Survived two power outages at DR
        • Survived multiple internet outages
        • Successful rolling upgrades to MySQL & CAS
        • Flow-based redesign
        • LPPE
        • Re-visit ticket registry
    17. Questions/Comments
        • Credits:
          – CAS developers and community
          – F5 & F5 devcentral
          – Unicon
          – LU & Txstate
        • Thank you for your time!!
        • Contacts:
          – Sri: Sri@lamar.edu
          – Bill: wgthom@unicon.net
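
The three routing handlers on slides 12 to 14 are given only as pseudocode, so the following is a minimal Python sketch of the same decisions, not the loadbalancer rules actually deployed. Several details are assumptions made for illustration and do not come from the slides: that a site ID is a ticket suffix such as `-siteB`, that the loadbalancer marks IDs it added itself with a hypothetical `-lb` marker, that the ticket travels in a `ticket` query parameter, and that the response-side ticket sits in a `Location` header. All function names are hypothetical.

```python
import re
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions (not from the slides): site IDs are ticket suffixes such as
# "-siteB", and IDs added by the loadbalancer carry an extra "-lb" marker.
LB_MARKER = "-lb"
ARTIFACT_RE = re.compile(
    r"<samlp:AssertionArtifact>\s*(.*?)\s*</samlp:AssertionArtifact>", re.DOTALL
)


def site_of(ticket, sites):
    """Return (site, lb_introduced) for a ticket, or (None, False) if untagged."""
    for site in sites:
        if ticket.endswith(f"-{site}{LB_MARKER}"):
            return site, True
        if ticket.endswith(f"-{site}"):
            return site, False
    return None, False


def on_http_request(uri, local, other, local_up, other_up):
    """HTTP_REQUEST: choose a destination from the ticket in the URI."""
    if not local_up and not other_up:
        return "unavailable"  # caller redirects to the branded outage page
    ticket = parse_qs(urlsplit(uri).query).get("ticket", [""])[0]
    site, _ = site_of(ticket, (local, other))
    if site == other and other_up:
        return "other"
    return "local"  # default route


def on_http_request_data(payload, local, other):
    """HTTP_REQUEST_DATA: route on the SAML artifact, blanking LB-added IDs."""
    match = ARTIFACT_RE.search(payload)
    if match is None:
        return "local", payload
    artifact = match.group(1)
    site, lb_introduced = site_of(artifact, (local, other))
    if site is None:
        return "local", payload  # assumed default; the slide leaves this open
    if lb_introduced:
        # Strip the extension so the CAS server sees the ticket it issued.
        cleaned = artifact[: -len(f"-{site}{LB_MARKER}")]
        payload = payload[: match.start(1)] + cleaned + payload[match.end(1):]
    return ("other" if site == other else "local"), payload


def on_http_response(headers, local, other):
    """HTTP_RESPONSE: tag an untagged ticket in a redirect with the local site."""
    location = headers.get("Location")
    if not location:
        return headers
    parts = urlsplit(location)
    query, changed = [], False
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "ticket" and site_of(value, (local, other))[0] is None:
            value = f"{value}-{local}{LB_MARKER}"
            changed = True
        query.append((key, value))
    if not changed:
        return headers
    tagged = dict(headers)
    tagged["Location"] = urlunsplit(parts._replace(query=urlencode(query)))
    return tagged
```

Keeping the suffix check in one helper (`site_of`) means the request, payload, and response paths agree on what counts as a site ID, which is the property the add-and-remove scheme on the loadbalancers depends on. The sketch follows the slides in leaving the request URI unmodified; only the SAML payload and the response are rewritten.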
