Stacking up with OpenStack: building for High Availability
Usage Rights

© All Rights Reserved

Presentation Transcript

  • Stacking up with OpenStack: Building for High Availability. Utpal Thakrar, Sr. Product Manager. April 17, 2013
  • Slide 2: My relationship with HA: 1975. Cloud Management #rightscale
  • Slide 3: My relationship with HA: 1991
  • Slide 4: My relationship with HA: 2001. "How many 9s can your product do?"
  • Slide 5: So what did they mean by 5-9s? Allowed downtime each year, by availability:
      • 99%: 3.65 days
      • 99.9%: 8.76 hours
      • 99.99%: 52.56 minutes
      • 99.999%: 5.26 minutes
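The downtime figures on this slide follow from simple arithmetic. A minimal sketch, assuming a 365-day year (which is what reproduces the slide's numbers); the function name is illustrative and not from the talk:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the yearly downtime budget for a given availability target."""
    return (100.0 - availability_percent) / 100.0 * SECONDS_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        seconds = allowed_downtime_seconds(target)
        print(f"{target}% -> {seconds / 60:.2f} minutes per year")
```

Each extra 9 cuts the yearly budget by a factor of ten, which is why the jump from 99.99% to 99.999% leaves only about five minutes per year.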
  • Slide 6: Stuff happens, are you prepared?
  • Slide 7: Whodunnit?…
  • Slide 8: And you see these …
  • Slide 9: Is 100% outage-proofing possible?
  • Slide 10: Old School Fault-Tolerance: Build Two
  • Slide 11: Golden Age of Cloud Computing
      • No Up-Front Capital Expense
      • Low Cost
      • Pay Only for What You Use
      • Self-Service Infrastructure
      • Easily Scale Up and Down
      • Improve Agility & Time-to-Market
      • Deploy
  • Slide 12: Golden Age for Fault-Tolerance
      • No Up-Front HA Capital Expense
      • Low Cost Backups
      • Pay for DR Only When You Use it
      • Self-Service DR Infrastructure
      • Easily Deliver Fault-Tolerant Applications
      • Improve Agility & Time-to-Recovery
      • Deploy
  • Slide 13: Yeah, but … What about my private cloud? Applications deployed in private clouds have to worry about:
      • Private cloud infrastructure being HA
      • Application architecture HA / DR
      • With public clouds: well, you get what your provider gives you
  • Slide 14: Private Cloud Infrastructure HA
      • Several single points of failure in an OpenStack deployment:
          • OpenStack API services
          • MySQL
          • RabbitMQ
      • Solved in various ways:
          • Pacemaker cluster management
          • Keepalived (e.g., RAX Private Cloud)
          • MySQL (Galera), RabbitMQ (active-active mirrored queues)
      • Eliminate SPoFs as best as you can.
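To see why eliminating single points of failure matters, the standard series/parallel availability formulas can be sketched as below. This is an illustration only: it assumes component failures are independent and failover is instantaneous, and the 99% per-component figure is hypothetical, not a number from the talk.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components must be up: availabilities multiply."""
    return prod(availabilities)


def redundant(availability: float, copies: int) -> float:
    """At least one of `copies` identical, independent replicas must be up."""
    return 1.0 - (1.0 - availability) ** copies


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of each service
    # API services, MySQL, and RabbitMQ each deployed as a single instance.
    print(f"three SPoFs in series: {series(single, single, single):.6f}")
    # The same three services, each deployed as a redundant pair.
    pair = redundant(single, 2)
    print(f"three redundant pairs: {series(pair, pair, pair):.6f}")
```

Under these assumptions, three chained 99% services deliver roughly 97%, while pairing each one lifts the chain to about 99.97%.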
  • Slide 15: What about my app? Design for failure:
      • If your application relies on the cloud infrastructure SLA for its HA needs, you are STUCK with that vendor / infrastructure
      • Need to balance cost and complexity against risk tolerance
      • Design the application so that you:
          • Build for server failure
          • Build for zone failure
          • Build for cloud failure
          • Keep the management layer separate from the infrastructure
  • Slide 16: Build for Server Failure
      • Set up auto-scaling
      • Set up database mirroring, master/slave configuration
      • Use static public IPs
      • Use dynamic DNS for private IPs
  • Slide 17: Build for Zone Failure
      • Diagram labels: DNS with static public IPs (172.168.7.31, 172.168.8.62); Zone 1 and Zone 2, each with load balancers; autoscaled app servers; master DB replicating to slave DB; block snapshots to the object store.
      • Where possible, use a NoSQL DB like Cassandra or MongoDB.
      • Snapshot the data volume for backups so the database can be readily recovered within the region.
      • Place slave databases in one or more zones for failover.
      • A creative deployment model would be to make your private cloud an "AZ" by placing it in close physical proximity to a public cloud provider.
  • Slide 18: Build for Cloud Failure (Cold DR). Staged server configuration and generally no staged data. Cost: $
      • Not recommended if rapid recovery is required
      • Slow to replicate data to the other cloud and bring the database online
      • Diagram labels: DNS (172.168.7.31); two clouds, Private and DALLAS, each with load balancers and app servers; master DB and slave DBs with replication; block snapshots; Cloud Files.
  • Slide 19: Build for Cloud Failure (Warm DR). Staged server configuration, pre-staged data, and a running slave database server. Cost: $$
      • Generally recommended DR solution
      • Minimal additional cost and allows fairly rapid recovery
      • Diagram labels: DNS (172.168.7.31); two clouds, Private and DALLAS, each with load balancers and app servers; master DB and slave DBs with replication; block snapshots; Cloud Files.
  • Slide 20: Build for Cloud Failure (Hot DR). Parallel deployment with all servers running but all traffic going to the primary. Cost: $$$
      • Not recommended
      • Very high additional cost to allow rapid recovery
      • Diagram labels: DNS (172.168.7.31); two clouds, Private and DALLAS, each with load balancers and app servers; master DB and slave DBs with replication; block snapshots; Cloud Files.
  • Slide 21: Availability vs. Cost - Dial. Two dials, Cost and Availability, each running from Min to Max.
  • Slide 22: Make sure the workload is portable across clouds
  • Slide 23: Automate and test everything
      • Automate backups of your data
      • Set up monitoring and alerts
      • Run fire-drills! Plan and practice your recovery procedures!
  • Slide 24: Separate the management layer from the infrastructure
      • Keep the keys to the car outside the car
  • Slide 25: Automating HA and DR
      • Use dynamic DNS for your database servers
          • Allow app servers to use a single FQDN.
          • Use a low TTL to allow rapid failover when the master database changes.
      • Automatic connection of app servers to load balancing servers
          • App servers can connect to all load balancers automatically at launch
          • No manual intervention
          • No DNS modifications
      • Automated promotion of slave to master
          • Process is automated
          • Decision to run the process is manual
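The interplay between a single database FQDN, a low TTL, and a manually triggered but automated promotion can be sketched with a toy in-memory model. Everything here is hypothetical: `DnsRecord`, `CachingResolver`, `promote_slave`, the hostname, and the addresses are stand-ins for illustration, not a real DNS or RightScale API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DnsRecord:
    """Toy stand-in for a dynamic DNS record (not a real DNS API)."""
    fqdn: str
    address: str
    ttl_seconds: int = 60


@dataclass
class CachingResolver:
    """Models an app server that caches the DB address for one TTL."""
    record: DnsRecord
    cached_address: Optional[str] = None
    expires_at: float = 0.0

    def resolve(self, now: float) -> str:
        # Re-read the record only once the cached answer has expired.
        if self.cached_address is None or now >= self.expires_at:
            self.cached_address = self.record.address
            self.expires_at = now + self.record.ttl_seconds
        return self.cached_address


def promote_slave(record: DnsRecord, slave_address: str,
                  operator_approved: bool) -> bool:
    """Automated promotion step, gated on a manual decision."""
    if not operator_approved:
        return False
    record.address = slave_address  # repoint the single FQDN at the new master
    return True


if __name__ == "__main__":
    record = DnsRecord("db-master.example.internal", "10.0.0.10", ttl_seconds=60)
    app = CachingResolver(record)
    print(app.resolve(now=0))    # app server learns the current master
    promote_slave(record, "10.0.0.20", operator_approved=True)
    print(app.resolve(now=30))   # still the cached old address
    print(app.resolve(now=60))   # TTL expired: app now sees the new master
```

In this model the TTL is the upper bound on how long an app server keeps talking to the old master after promotion, which is the reason the slide asks for a low value.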
  • Slide 26: Samsung SDS, Mr. Kirk Kim. Copyright © 2013 Samsung SDS Co., Ltd. All rights reserved
  • Slide 27: Hybrid Cloud Network Architecture (diagram)
      • SPCS side: internet traffic, CF router (public ASN: XXXX), firewall, IPS, VPN gateway, compute VMs with EIPs (e.x.y.a, e.x.y.b), private network, object storage
      • Public cloud side: VPC with virtual GW and internet GW, VMs on private 10.x.x.x/24 and public *.*.*.0/24 subnets
      • Legend: traffic between SPCS and the public cloud using public IP; traffic between SPCS and the public cloud using private IP; internet traffic to SPCS and the public cloud using public IP
  • Slide 28: How RightScale makes it possible: RightScale ServerTemplates™
      • Reproducible: Predictable deployment
      • Dynamic: Configuration from scripts at boot time
      • Multi-cloud: Cloud agnostic and portable
      • Modular: Role and behavior abstracted from cloud infrastructure
  • Slide 29: How RightScale makes it possible: MultiCloud Images
      • MultiCloud Images can be launched across regions and clouds without modification
      • 1. A ServerTemplate contains a list of MultiCloud Images (MCIs).
      • 2. When the Server is created, a specific MCI is chosen.
      • 3. The appropriate RightImage is used at launch.
      • Diagram labels: MultiCloud Images mapped to images on Cloud A, Cloud B, and Cloud C; RightImage; stability across clouds.
  • Slide 30: Outage-Proofing Best Practices
      • Place in >1 zone: load balancers, app servers, databases
      • Maintain capacity to absorb zone or region failures
      • Replicate data across zones
      • Back up across regions & clouds
      • Monitor, alert, and automate operations to speed up failover
      • Design stateless apps for resilience to reboot / relaunch
  • Slide 31: Thank you! Sign up for a free account at: www.rightscale.com. Check out job postings at: www.rightscale.com/jobs. We are hiring!