When deploying OpenStack in production at any scale, upgrade support is a requirement for a successful deployment. Without upgrade management, a deployment will carry bugs and security issues from day one. In the longer term, it will also miss the latest features that OpenStack offers.
OpenStack Summit Vancouver: Lessons learned on upgrades
1. The importance of HA and automation tools
Frédéric Lepied
Engineering Manager
flepied@redhat.com
Emilien Macchi
Senior Software Engineer
emilien@redhat.com
2. Red Hat Cloud Innovation Practice Engineering
Frédéric Lepied: RCIP Engineering manager
Emilien Macchi: installer team / Puppet PTL
4. OpenStack is a wonderful place,
but upgrades are not easy.
5. What is a successful upgrade?
• No need for new hardware
• As little interruption as possible
• Minor and major upgrade support
• Efficient, fast, reproducible process
8. Enough free capacity
• Have enough compute resources to migrate instances
• Have some spare capacity in case of failure
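The capacity rule above can be checked with simple arithmetic before draining a host. This is a minimal sketch, not the presenters' tooling: the host names, the vCPU figures and the `spare_vcpus` parameter are illustrative assumptions, and it only compares totals (it ignores RAM, disk and whether each instance fits on a single host).

```python
def can_drain(host, used, capacity, spare_vcpus=0):
    """Return True if the other hosts have enough free vCPUs to absorb
    the load of `host` while keeping `spare_vcpus` in reserve."""
    others = [h for h in capacity if h != host]
    free_elsewhere = sum(capacity[h] - used[h] for h in others)
    return free_elsewhere - spare_vcpus >= used[host]


# Hypothetical three-node compute pool (vCPUs).
capacity = {"cmp1": 32, "cmp2": 32, "cmp3": 32}
used = {"cmp1": 20, "cmp2": 16, "cmp3": 24}

ok_without_spare = can_drain("cmp1", used, capacity)
ok_with_spare = can_drain("cmp1", used, capacity, spare_vcpus=8)
```

With these numbers the other two hosts have 24 free vCPUs, enough for the 20 in use on `cmp1`, but not enough once 8 vCPUs are held back for failures.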
9. Image based workflow (recommended)
• Build your images once
• Install using your images
• Upgrade using your images
10. Build and archive your images
• Build your image in a CI
• Use packaging tools (yum, apt, …)
• Compression & archive
• Stamp with versioning
• Use Cloud Storage (Swift, Ceph)
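As an illustration of the "compress, archive and stamp with a version" steps, here is a minimal sketch using only the Python standard library. The naming scheme `<role>-<version>.tar.gz`, the role name `openstack-full` and the checksum file are assumptions made for the example; uploading the result to Swift or Ceph is left out.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def archive_image(image_dir, out_dir, role, version):
    """Compress image_dir into <role>-<version>.tar.gz and record its SHA-256."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = out_dir / f"{role}-{version}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(image_dir), arcname=".")
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    # Checksum file in the "digest  filename" layout that sha256sum prints.
    Path(str(archive) + ".sha256").write_text(f"{digest}  {archive.name}\n")
    return archive, digest


# Demo on a throwaway directory standing in for a built image tree.
with tempfile.TemporaryDirectory() as tmp:
    tree = Path(tmp) / "tree"
    tree.mkdir()
    (tree / "release").write_text("openstack-full 1.0.0\n")
    archive, digest = archive_image(tree, Path(tmp) / "out", "openstack-full", "1.0.0")
    archive_name = archive.name
    checksum_exists = Path(str(archive) + ".sha256").exists()
```

Putting the version in the file name and keeping a checksum next to it makes it possible to tell later exactly which image a node was installed or upgraded from.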
Image based deployment
11. Limit the number of images
• More images = more pain
• Single image with:
• all packages installed
• all services disabled at boot
12. Prohibit packaging tools
• Keep systems:
• consistent
• reproducible
• auditable
• Speed up configuration management
• Allow the tools to be re-enabled when needed
13. Upgrade your system with a tool
• APT / YUM:
• too slow at scale (~20 min / node)
• need to manage your repositories
• Using eDeploy:
• very fast at scale (~20 s / node)
• allows rollbacks
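To see what the per-node figures above mean at scale, here is a small worked calculation. The fleet size of 200 nodes and the assumption that nodes are upgraded one after another are hypothetical; only the ~20 min and ~20 s per-node figures come from the slide.

```python
def total_hours(nodes, seconds_per_node, parallelism=1):
    """Wall-clock hours to upgrade `nodes` systems, `parallelism` at a time."""
    batches = -(-nodes // parallelism)  # ceiling division
    return batches * seconds_per_node / 3600


NODES = 200  # hypothetical fleet size

package_based = total_hours(NODES, 20 * 60)  # ~20 min per node
image_based = total_hours(NODES, 20)         # ~20 s per node
```

Under these assumptions the package-based run takes roughly 67 hours and the image-based run a little over one hour; running several nodes in parallel shrinks both, but the ratio of 60 stays the same.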
15. Control system upgrade
We need:
• one command to upgrade one system
• no service restarted or reloaded
• the ability to roll back
What we use:
• eDeploy: a tool that upgrades images with rsync
Automation tooling
16. Configuration management
• Puppet, Chef, Ansible, whatever you like
• “The best tool is the one you already use.”
• But:
• … you still need to update your configuration
• … do not use it to manage packages
17. Orchestrator
• Puppet and Chef are good for configuration
• But you need to orchestrate multiple systems:
• restart services in the right order
• upgrade the system at the right time
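One way to think about "restart services in the right order" is as a dependency graph that gets sorted before anything is started or stopped. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the service names and their dependencies are illustrative assumptions, not a statement of what any real deployment requires.

```python
from graphlib import TopologicalSorter

# Hypothetical map: service -> services that must be running before it starts.
DEPENDS_ON = {
    "mariadb": set(),
    "rabbitmq": set(),
    "keystone": {"mariadb"},
    "nova-api": {"keystone", "mariadb", "rabbitmq"},
    "nova-compute": {"nova-api", "rabbitmq"},
}


def start_order(depends_on):
    """Order in which services can be started (dependencies first)."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Order in which services should be stopped (dependents first)."""
    return list(reversed(start_order(depends_on)))


starting = start_order(DEPENDS_ON)
stopping = stop_order(DEPENDS_ON)
```

An orchestrator would walk `stopping` before the system upgrade and `starting` after it, which a per-node configuration management run cannot do on its own.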
18. Upgrade workflow
• Pre-upgrade actions
• Resources evacuation
• Stop OpenStack services
• Stop Infra / system services
• Upgrade packages
• Start Infra / system services
• Start OpenStack services
• Post-upgrade actions
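The workflow above is an ordered list of phases where a failure should stop the run before the next phase starts. A minimal sketch of such a runner follows; the `actions` mapping and the idea of returning the failed phase are assumptions made for illustration, and the callables stand in for whatever actually performs each phase.

```python
PHASES = [
    "pre-upgrade actions",
    "resources evacuation",
    "stop OpenStack services",
    "stop infra / system services",
    "upgrade packages",
    "start infra / system services",
    "start OpenStack services",
    "post-upgrade actions",
]


def run_workflow(actions, phases=PHASES):
    """Run each phase in order; `actions` maps a phase name to a callable
    returning True on success. Phases without an action are skipped over.
    Returns (completed_phases, failed_phase_or_None)."""
    done = []
    for phase in phases:
        action = actions.get(phase)
        if action is not None and not action():
            return done, phase
        done.append(phase)
    return done, None


# Demo with stand-in actions: everything succeeds except the package upgrade.
actions = {phase: (lambda: True) for phase in PHASES}
actions["upgrade packages"] = lambda: False
completed, failed_at = run_workflow(actions)
```

Here `completed` holds the first four phases and `failed_at` is `"upgrade packages"`, so services are not restarted on top of a half-upgraded system.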
19. Example: upgrade a compute node
• evacuate virtual machines
• disable nova compute service
• system upgrade
• update config
• service libvirtd restart
• service openstack-nova-compute restart
• enable nova-compute service
• test the service
21. Automate the workflow
• Upgrades are repetitive
• Prepare an upgrade without effort
• Prepare Ansible Playbooks with snippets
• Compose Playbooks by computing:
• what is upgraded in the image
• which service is running on a node
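The composition step can be sketched as two set computations: which packages differ between the old and new image, and which of the affected services actually run on a given node. Everything concrete below (package versions, the package-to-service map, the snippet ordering) is hypothetical and only meant to show the shape of the computation, not the presenters' implementation.

```python
def changed_packages(old, new):
    """Names of packages that are new or whose version differs in `new`."""
    return {name for name, version in new.items() if old.get(name) != version}


def snippets_for_node(changed, running_services, package_to_service, snippet_order):
    """Ordered list of per-service snippets needed on one node."""
    affected = {package_to_service[p] for p in changed if p in package_to_service}
    needed = affected & set(running_services)
    return [s for s in snippet_order if s in needed]


# Hypothetical package manifests of the old and new image.
old_image = {"openstack-nova-compute": "2014.2.2", "libvirt": "1.2.8", "openssl": "1.0.1"}
new_image = {"openstack-nova-compute": "2015.1.0", "libvirt": "1.2.9", "openssl": "1.0.1"}

package_to_service = {
    "openstack-nova-compute": "openstack-nova-compute",
    "libvirt": "libvirtd",
}
snippet_order = ["libvirtd", "openstack-nova-compute"]

changed = changed_packages(old_image, new_image)
compute_plan = snippets_for_node(
    changed, ["libvirtd", "openstack-nova-compute"], package_to_service, snippet_order
)
controller_plan = snippets_for_node(changed, ["keystone"], package_to_service, snippet_order)
```

The compute node gets both snippets in the declared order, while a node that runs none of the affected services gets an empty plan.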
22. Ansible best practices
• Use tags in snippets to define ordering
• Run HA nodes in serial
• Run compute nodes in parallel
• Use a script for hypervisor evacuation
• Allow playbooks to resume after a failure
• Use one snippet for each service to upgrade
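The rollout pattern in these practices (HA nodes one at a time, compute nodes in parallel, resume after a failure) can be sketched outside Ansible as well. The Python version below is an illustration only: the `upgrade` callable, the node names and the `already_done` checkpoint are assumptions, and in a real playbook the same effect would come from Ansible's own mechanisms.

```python
from concurrent.futures import ThreadPoolExecutor


def roll(ha_nodes, compute_nodes, upgrade, already_done=(), max_workers=4):
    """Upgrade HA nodes serially, then compute nodes in parallel.
    Nodes listed in `already_done` are skipped, so a run can be resumed.
    Returns (done_nodes, failed_nodes)."""
    done = set(already_done)
    for node in ha_nodes:
        if node in done:
            continue
        if not upgrade(node):
            return done, [node]  # stop before touching another HA node
        done.add(node)
    pending = [n for n in compute_nodes if n not in done]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(upgrade, pending))
    failed = [n for n, ok in zip(pending, results) if not ok]
    done.update(n for n, ok in zip(pending, results) if ok)
    return done, failed


# Demo with a stand-in upgrade function: cmp2 fails on the first run only.
broken = {"cmp2"}
calls = []


def fake_upgrade(node):
    calls.append(node)
    return node not in broken


ha, computes = ["ctl1", "ctl2", "ctl3"], ["cmp1", "cmp2", "cmp3"]
done, failed = roll(ha, computes, fake_upgrade)
broken.clear()
calls_before_retry = len(calls)
done_after_retry, failed_after_retry = roll(ha, computes, fake_upgrade, already_done=done)
```

On the second call only `cmp2` is upgraded again, because every other node is already recorded as done.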
24. Conclusion
• Your OpenStack needs HA
• Make sure you have free capacity
• Image based upgrade is a good option
• Orchestration and Configuration Management are key