Learning to Scale OpenStack

Learning to Scale OpenStack: A Case Study in Rackspace's Open Cloud Deployment was presented at the OpenStack Design Summit in Portland, OR on April 17, 2013. Watch the recording of the presentation on YouTube at the following link: http://www.youtube.com/watch?v=3x8X6f5mnzc

  1. Learning to Scale OpenStack: A Case Study in Rackspace's Open Cloud Deployment
     April 17, 2013 at 4:30pm
     Rainya Mosher, Dev Manager, Deploy Infrastructure
     IRC: rainya on freenode
     Twitter: @rainyamosher
  2. In the Arena
     RACKSPACE® HOSTING | WWW.RACKSPACE.COM
     “It is not the critic who counts; not the man who points out how the strong man stumbles, or where the doer of deeds could have done them better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; . . . who at best knows in the end the triumph of high achievement, and who at worst, if he fails, at least fails while daring greatly.”
     Theodore Roosevelt, The Man in the Arena, April 1910
  3. What does “At Scale” Mean?
     [Diagram: a Global Cloud contains Regions, each Region contains Cells, and each Cell contains hypervisors (HVs). Scale labels: Hundreds of HVs, Thousands of HVs, Tens of Thousands of HVs, Hundreds of Thousands of HVs.]
  4. What is the Control Plane Release Strategy?
     Code → Package → Deploy → Verify
  5. First Scaling Hurdle – Deploy Mechanism
     • Aug 2012
       – Rackspace launches Open Cloud
       – Frequent releases to fine tune
     • Sep 2012 through Nov 2012
       – Deploying code that is two weeks from trunk takes about two hours
       – Begin designing new deploy mechanism at October Summit
     • Dec 2012
       – Code deploys take 4 to 6 hours
       – Deploy team says, bleary-eyed, they aren’t doing it again
     • Jan 2013
       – Deploy again
       – Takes more than 6 hours
       – Accept that it is no longer “reasonable” and temporarily stop deploying code releases
       – Focus on the deploy mechanism
     [Chart: Internal Code Releases and Capacity by month, Aug-12 through Feb-13, with a linear trend line for Internal Code Releases]
  6. Improving the Deploy Mechanism
     • Package: switched from Debian packages to virtual environments
     • Distribute: used torrent for the package, pssh for fact files, and mcollective for actions
     • Execute: moved from a centralized puppet master to decentralized, masterless puppet
  7. Second Scaling Hurdle – Catch up to Trunk
     • March 2013
       – Production code is 2 months behind trunk
       – Trunk as of 2/28 becomes our “v152” and bakes in preprod
       – Prep for impacting DB migrations in production
       – Re-enable our CI process
     • April 2013
       – Deploy v152 to production
       – 10x increase in DB traffic
       – Community works to fix
       – Re-deploy v152 with Community fixes
       – Attend Summit in Portland and share the story
     [Chart: DB throughput, annotated at four points]
       1 – Normal DB throughput
       2 – First installation of v152
       3 – Disabled several periodic tasks
       4 – Re-installed v152 with patches from Community and turned periodic tasks back on
  8. How Can We Adapt for Scale Issues?
     • Testing & Environments
       – More robust testing coverage
       – Deployer-specific testing further upstream
       – Production-like dev environments
       – Simulate production compute numbers on non-production hardware
     • Database & Code Management
       – Non-disruptive DB migration patterns
       – DB calls with 6 million rows in mind, not just 60
       – Code optimization paths for large datasets
     • Process & Community
       – Stay close to trunk, even though it is hard
       – Explore options for a continuously deployable trunk
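One way to read the "DB calls with 6 million rows in mind, not just 60" bullet is that queries should fetch bounded batches rather than whole tables. The following is a minimal sketch of keyset pagination using Python's standard `sqlite3` module. The `instances` table, its columns, and both function names are hypothetical and chosen only for illustration; they are not taken from the presentation or from the OpenStack schema.

```python
import sqlite3
from typing import Iterator, List, Tuple


def build_demo_db(row_count: int) -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical `instances` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO instances (id, state) VALUES (?, ?)",
        ((i, "active" if i % 2 else "deleted") for i in range(1, row_count + 1)),
    )
    conn.commit()
    return conn


def iter_rows_in_batches(
    conn: sqlite3.Connection, batch_size: int = 1000
) -> Iterator[List[Tuple[int, str]]]:
    """Yield rows in primary-key order, one bounded batch at a time.

    Each query resumes after the last id seen (keyset pagination), so the
    cost per batch does not grow with the offset into the table.
    Assumes ids are positive integers.
    """
    last_id = 0
    while True:
        batch = conn.execute(
            "SELECT id, state FROM instances WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return
        yield batch
        last_id = batch[-1][0]
```

A caller would loop with `for batch in iter_rows_in_batches(conn, 1000): ...`, holding at most one batch in memory whether the table has 60 rows or 6 million.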
  9. Backup Slides
     Many of these backup slides were first presented on 4/16/2013 during the OpenStack Summit session “Deploying from OpenStack Trunk” and are included here for reference.
  10. Merge and Branch Strategy
      • The most recent Rackspace release branch took over 50 minor tags to make it work in production
      • The Rackspace development branch is about 40 patches on top of OpenStack trunk for internal service compatibility
  11. Package and Distribute Strategy
      • Package
        – per-project venv
        – .tar of project venvs + configs
      • Distribute
        – seed .torrent
        – distribute fact files
        – verify completion
      • Execute
        – switch version
        – sync databases
        – run puppet
        – verify completion
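The "Package" and "switch version" steps on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Rackspace tooling: the slide does not say how the version switch was done, so the atomic symlink swap below is an assumed technique, and the function names, the `current` link name, and the directory layout are all hypothetical. The sketch assumes a POSIX filesystem.

```python
import os
import tarfile
from pathlib import Path


def build_release_tarball(release_dir: Path, out_dir: Path, version: str) -> Path:
    """Bundle a directory of per-project venvs and configs into one .tar."""
    tar_path = out_dir / f"release-{version}.tar"
    with tarfile.open(tar_path, "w") as tar:
        # Store everything under a top-level directory named after the version.
        tar.add(release_dir, arcname=version)
    return tar_path


def unpack_release(tar_path: Path, releases_root: Path) -> None:
    """Unpack a release tarball next to any previously installed versions."""
    with tarfile.open(tar_path) as tar:
        tar.extractall(releases_root)


def switch_version(releases_root: Path, version: str, link_name: str = "current") -> Path:
    """Point the `current` symlink at `version` using an atomic rename.

    Assumption: services run from releases_root/current, so replacing the
    symlink switches versions without a window where the path is missing.
    """
    if not (releases_root / version).is_dir():
        raise FileNotFoundError(f"release {version} is not unpacked")
    link = releases_root / link_name
    tmp_link = releases_root / f".{link_name}.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()  # clean up after an interrupted earlier switch
    os.symlink(version, tmp_link)  # relative link: current -> <version>
    os.replace(tmp_link, link)     # rename over the old link atomically
    return link
```

Keeping old versions unpacked alongside the new one also makes rollback a second call to `switch_version` with the previous version string.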
  12. Deploy and Test Strategy
      • Dev
        – pre-code check-in validation
      • Integration
        – smoke tests
        – unit tests
      • QA
        – functional tests
        – integration tests
      • Pre-Prod
        – regression tests
        – build tests
      • Production
        – smoke tests
        – build tests
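The staged test strategy on this slide can be pictured as a promotion gate: a build advances to the next environment only after every suite in the current one passes. The sketch below encodes the stage and suite names from the slide; the `promote` function and the `run_suite` callback are hypothetical stand-ins for a real test runner, and the stop-at-first-failure rule is an assumption about how such a gate might behave.

```python
from typing import Callable, List, Optional, Tuple

# Stage names and suites as listed on the slide, in pipeline order.
PIPELINE: List[Tuple[str, List[str]]] = [
    ("Dev", ["pre-code check-in validation"]),
    ("Integration", ["smoke tests", "unit tests"]),
    ("QA", ["functional tests", "integration tests"]),
    ("Pre-Prod", ["regression tests", "build tests"]),
    ("Production", ["smoke tests", "build tests"]),
]


def promote(
    build: str, run_suite: Callable[[str, str, str], bool]
) -> Tuple[Optional[str], List[Tuple[str, str, bool]]]:
    """Walk the stages in order and stop at the first failing suite.

    Returns the last stage whose suites all passed (None if the build
    failed in Dev) and a log of (stage, suite, passed) entries.
    """
    passed_stage: Optional[str] = None
    log: List[Tuple[str, str, bool]] = []
    for stage, suites in PIPELINE:
        for suite in suites:
            ok = run_suite(build, stage, suite)
            log.append((stage, suite, ok))
            if not ok:
                return passed_stage, log
        passed_stage = stage
    return passed_stage, log
```

For example, `promote("v152", lambda build, stage, suite: True)` walks all five stages, while a callback that fails the QA integration tests leaves the build at Integration.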
  13. Benefits and Challenges of Trunk Deploys
      Why We Do It (Benefits)
      • Issue Resolution
        – Early detection of issues and conflicts
        – Shorter feedback loop within the community
        – Faster resolution of issues
      • Early Feature Delivery
        – Smaller, incremental periodic releases
        – More stable release candidates at end of cycle
      Why It’s Hard (Challenges)
      • Code Management
        – Merge conflicts with local patches
        – Disruptive DB migrations
        – Service restarts
        – Temporary version skew
      • Testing
        – Devstack-based testing vs testing at scale
        – Rework when issues found in RAX deploy pipeline
      • Process
        – CI/CD vs Release methodology
        – Time to merge patches
  14. Scale of Deploy Pipeline
      • Dev: DevStack
      • Integration & QA: 10s of Nodes
      • PreProd: 100s of Nodes
      • Production: 1,000s of Nodes
  15. RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218 | US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
      © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES.
