April 16, 2014
Ops and Chefs: Lessons learned while working on
OpenStack deployment cookbooks
#chefconf 2014
RACKSPACE® HOSTING | WWW.RACKSPACE.COM
Who Are We?
Justin Shepherd
Principal Architect
Rackspace Private Cloud
github.com/galstrom21
Joseph Breu
Software Dev Team Lead US
Rackspace Private Cloud
github.com/rackerjoe
@rackerjoe
Support Team
• Manages and operates OpenStack clouds for our customers
• Troubleshoots any issues a customer might encounter with their cloud
Dev Team
• Creates tools for deploying and configuring OpenStack clouds
• Makes the initial decision on when a new project is “stable” enough to
enable in customer clouds
A little context
In the beginning….
The Dev Team were experts at deployments
Best Practices from the Ops Team
Over time….
The Ops team took the reins and became the experts.
New expertise breeds new Best Practices!
Best Practices from the Ops Team
In the end….
How do you codify this new knowledge?
Best Practices from the Ops Team
Nova
• Has 1000+ configuration options
• Has support for 4 hypervisor platforms
Neutron
• Has support for ~15 network platforms
Cinder
• Has support for ~20 storage platforms
Flexibility vs. Standardization
Limit platform options based on current expertise
• We chose KVM because it had the best support in the
community, and we have lots of Linux expertise in house
• We chose NetApp initially and leveraged our existing storage
team
Add more options as needed based on demand
• We need compelling reasons to add features
Each addition increases QE test matrices exponentially!
Flexibility vs. Standardization
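To make the test-matrix growth concrete, here is a minimal Python sketch. Only KVM, NetApp, and the three base operating systems come from the slides; every other option name is a hypothetical placeholder, and the function name is invented for illustration. It shows that the number of deployments QE has to cover is the product of the choices on each axis, so a single new option multiplies the whole matrix.

```python
from itertools import product

def build_matrix(hypervisors, networks, storage, operating_systems):
    """Return every combination QE would need to deploy and test."""
    return list(product(hypervisors, networks, storage, operating_systems))

# Base operating systems named in the slides.
BASE_OS = ["ubuntu-lts", "rhel", "centos"]

# Starting point: one hypervisor and one storage backend.
# "network-a" is a hypothetical placeholder, not a real platform name.
baseline = build_matrix(["kvm"], ["network-a"], ["netapp"], BASE_OS)

# Add one hypothetical option on the hypervisor and storage axes.
expanded = build_matrix(["kvm", "hypervisor-b"], ["network-a"],
                        ["netapp", "storage-b"], BASE_OS)

print(len(baseline))  # 3
print(len(expanded))  # 12
```

Two small additions quadruple the number of full deployments to test, which is why each new feature needs a compelling reason.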
Even with a limited scope, it takes 61,480 lines of cookbook code to
implement a working OpenStack cluster
Flexibility vs. Standardization
Issues with cookbooks
• Setting an incorrect value in nova's config file
• Absence of a useful attribute
• Cookbook dependencies and upstream breakage (COOK-2676)
Issues in the upstream OpenStack projects
• Hopefully these are known issues, and the community is addressing them
Issues in the upstream OS packages
• Vendor X breaks a package (e.g. kernel and openvswitch), where X is all of them!
Issues in foundation technologies
• MySQL
• RabbitMQ
Issue Tracking
The usual suspects
• Ubuntu LTS
• RHEL
• CentOS
The usual problems
• Independent packaging process
• Separate bug tracking process
• Packages are cut at different points in time
Multiple Base Operating Systems
OpenStack releases a new “stable” version every 6 months.
• There are also 3 scheduled minor version releases for the stable
version during the following 6-month cycle.
• However, none of the upstream packages match these releases
Deciding on your Release Schedule
We support the current “stable” version and one prior
• So our customers do not have to update every 6 months
• Upgrades are currently disruptive
We currently release minor version updates every other month
• Includes cookbook bug fixes
• Includes latest available packages
Deciding on your Release Schedule
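The "current stable plus one prior" policy can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the release names are upstream OpenStack code names used as sample data, and the 12-month figure assumes a vendor release tracks each upstream release immediately, which the slides do not claim.

```python
def supported_releases(releases, window=2):
    """Return the newest `window` releases from an oldest-first list."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return releases[-window:]

def months_supported(cadence_months=6, window=2):
    """Months a release stays in the support window under the assumption above."""
    return cadence_months * window

# Sample data: upstream release code names, oldest first.
releases = ["grizzly", "havana", "icehouse"]
current_window = supported_releases(releases)

print(current_window)      # ['havana', 'icehouse']
print(months_supported())  # 12
```

With a 6-month upstream cadence and a window of two, a customer can skip at most one upgrade before falling out of support.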
This is probably one of the easiest decisions to make and the
hardest to get right
• We have changed our model multiple times in the last 3 years
• We will probably change it again
Go ask @claco Thursday 2:50pm Regency A/B about this
Deciding on your Release Schedule
Major changes can occur every 6 months
• Package renames
• Service renames
• New/Deprecated config options
Services get split into their own projects
• Nova-Volumes -> Cinder
• Nova-Network -> Neutron (kind of)
• Nova-Scheduler -> Gantt
Managing Chef deployed OpenStack
Inter-release upgrades
• Sweeping attribute key renames
• node['quantum'] -> node['neutron']
• Have to remap customers' environments
• https://github.com/rcbops/mungerator
• New BUGS!
• That never happens... it is a new version... it must be a feature
OS Upgrades
• Terrible, terrible, and more terrible
Managing Chef deployed OpenStack
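The attribute key rename mentioned above (node['quantum'] -> node['neutron']) can be illustrated with a short Python sketch that rewrites keys in a Chef environment loaded as a dictionary. This is not how mungerator works; it is a standalone illustration. The function name and the sample attributes under "quantum" and "nova" are hypothetical, and only the override_attributes field comes from Chef's environment format.

```python
def rename_key(value, old_key, new_key):
    """Return a copy of nested dicts/lists with old_key renamed to new_key."""
    if isinstance(value, dict):
        if old_key in value and new_key in value:
            # Refuse to silently overwrite settings that already exist.
            raise ValueError(f"both {old_key!r} and {new_key!r} are present")
        return {
            (new_key if key == old_key else key): rename_key(item, old_key, new_key)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [rename_key(item, old_key, new_key) for item in value]
    return value

# Hypothetical customer environment; attribute names are illustrative only.
environment = {
    "name": "production",
    "override_attributes": {
        "quantum": {"ovs": {"network_type": "gre"}},
        "nova": {"network": {"provider": "quantum"}},
    },
}

migrated = rename_key(environment, "quantum", "neutron")
print(sorted(migrated["override_attributes"]))
# ['neutron', 'nova']
print(migrated["override_attributes"]["nova"]["network"]["provider"])
# quantum
```

Note that only keys are renamed. The string value "quantum" is left untouched, so values that reference the old name need their own remapping pass, which is part of why these upgrades are painful.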
Jenkins is used for testing and gating of our cookbooks
• Syntax Verification (pep8, foodcritic)
• Full deployment of OpenStack utilizing the cookbooks with the
proposed changeset applied
• Functional Tests of OpenStack
• OpenStack API Testing
• OpenStack CLI Testing
Upstream Opscode cookbooks are tested before inclusion
Testing
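The gating flow above can be sketched as an ordered list of stages where the first failure stops the run. This is a plain Python illustration, not a Jenkins configuration: the function names are hypothetical, and the lambdas are stand-ins for whatever commands a real job would run for each stage.

```python
def run_gate(stages):
    """Run (name, check) stages in order and stop at the first failure."""
    results = []
    for name, check in stages:
        passed = bool(check())
        results.append((name, passed))
        if not passed:
            break  # Later stages are skipped once a stage fails.
    return results

def gate_passed(results, stages):
    """A changeset merges only if every stage ran and passed."""
    return len(results) == len(stages) and all(ok for _, ok in results)

# Stand-in checks; here the full deployment is made to fail on purpose.
stages = [
    ("syntax verification", lambda: True),
    ("full deployment", lambda: False),
    ("api tests", lambda: True),
    ("cli tests", lambda: True),
]

results = run_gate(stages)
print(results)
# [('syntax verification', True), ('full deployment', False)]
print(gate_passed(results, stages))
# False
```

Ordering the cheap syntax checks first means a broken changeset is rejected before an expensive full deployment is attempted.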
RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218
US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM
QUESTIONS?

Rackspace Private Cloud presentation for ChefConf 2014
