An overview of 20 automation projects within OpenStack. The presentation for OpenStack online meetup www.meetup.com/OpenStack-Online-Meetup/ Recording is at https://plus.google.com/u/0/events/ca0d20climslpjgm8dml1lft0p8
My name is DZ. I’ve been in the field of datacenter automation and cloud management for over 10 years. Now I’m the CTO and co-founder of StackStorm. At StackStorm we do operations automation. We work with OpenStack, we contribute to OpenStack, and I’m personally fascinated by the OpenStack phenomenon. It is expanding to reinvent pretty much everything in system software, sometimes going overboard, but I believe natural forces and foundation governance will take care of it over time. So… The DevOps movement is happening here. New approaches to automation are being tried out here. Interesting projects and services are emerging around OpenStack, and within OpenStack. That’s what we want to talk about today.
We will go over operations automation projects under the OpenStack umbrella:
No vendor tools, no general open-source tools, however popular or relevant they are. Just projects within OpenStack.
Further, we filter out the monitoring space – metering, monitoring, logging – and we are left with a surprising 20 OpenStack projects.
One of the first open-source OpenStack deployment frameworks.
The original Crowbar was developed by Dell as an OpenStack bare-metal installer. It’s a Rails application that does server discovery, provides firmware updates, installs a base OS via PXE boot, and deploys OpenStack components via Chef.
The re-architected Crowbar v2 (aka OpenCrowbar) marks the transition from an installer into a tool that manages upgrades, continuous deployments, and other ongoing operations. It is no longer hardwired to Chef and can use other configuration management tools. It is also made to deploy onto virtual machines and containers.
Status: Crowbar is no longer an OpenStack-specific tool; I bring it up here as an exception out of historic respect for the first OpenStack installer. Both v1 and v2 are functional and mature, but not seen much in new deployments.
Fuel “The control plane for installing and managing OpenStack.”
Originally it was Mirantis’ proprietary solution. In 2013 it was open-sourced and contributed to OpenStack. Fuel is an orchestration layer on top of Puppet, MCollective, and Cobbler. It codifies Mirantis’ best practices of OpenStack deployment. Like other tools in this category, it does hardware discovery, network verification, OS provisioning, and deployment of OpenStack components.
Fuel’s distinct feature is a polished, easy-to-use web UI that makes OpenStack installation seem simple.
Status: First released in 2013, it is now an OpenStack “Related” project. Fuel is widely seen in the field. OpenStack newbies often choose Fuel for their proofs of concept, attracted by how easily it gets their cloud up and running. Mirantis’ consultants have brought Fuel into some large production deployments.
Yet another OpenStack deployment tool, developed by Huawei for their specific needs and open-sourced as an OpenStack Related project in January 2014. Compass developers position it as a simple, extensible, data-driven platform for deployment, not limited to OpenStack. Through its plugin layer, it leverages other tools for hardware discovery, OS and hypervisor deployment, and configuration management.
Status: “OpenStack Related” project. Used internally by Huawei, open-sourced to StackForge in January 2014. I haven’t seen it used in the field just yet.
TripleO: TripleO installs, upgrades, and operates OpenStack clouds using OpenStack’s own cloud facilities.
Yes, “it takes OpenStack to deploy OpenStack”.
In essence, TripleO is a dedicated OpenStack installation, called “the undercloud”, that is used to deploy other OpenStack clouds – “overclouds” – on bare metal.
The desired overcloud configuration is described in a Heat template, and the deployment is orchestrated by Heat. The nodes are provisioned on bare metal using Nova bare-metal (Ironic): it PXE-boots the machines and installs images with OpenStack components. The images are dynamically generated with diskimage-builder from image elements.
Operators enjoy using familiar OpenStack tools – Keystone authentication, the Horizon dashboard, and the nova CLI – deploying and operating an OpenStack cloud on hardware just like they deploy and operate a virtual environment.
TripleO targets ultra-large-scale deployments (they say small deployments are solved by other tools). The sweet spot seems to be continuous integration and continuous deployment of multiple evolving OpenStack clouds, at the hardware layer.
Status: TripleO is an “Integrated” project. With all the traction in the OpenStack community and support from Red Hat, HP, and others, it seems to be the way to go long-term. The readiness status is a big question mark: on the one hand, it’s officially an “Integrated” project and, notably, used by the HP Helion cloud as its deployment tool. On the other hand, the wiki and docs claim that it’s functional but work in progress, and it’s not seen in production or even POC clouds. I expect it to be ready for prime time in the K cycle, spring 2015.
* DevStack is best known for the ease of bringing up a complete OpenStack cloud for development or playing around. Not for production!
* Other smaller deployment-related tools under the OpenStack umbrella:
* PackStack: a utility that uses Puppet modules to deploy various parts of OpenStack on multiple pre-installed servers over SSH automatically. Surprisingly, it is used a lot: according to the survey, it’s #4 after Puppet, Chef, and DevStack.
* Warm: provides the ability to deploy OpenStack resources from YAML templates. Not seen at all.
* Kickstack: a pure-Puppet wrapper that uses the StackForge Puppet modules to deploy OpenStack. Not seen at all.
* Inception: OpenStack in OpenStack, for testing and playing.
* Anvil: a DevStack variant written in Python. Supported by Yahoo.
Service to define a stack of resources as a template, and orchestrate their deployment and life cycle.
A user defines a virtual infrastructure “stack” as a template: a simple YAML file describing resources and their relations – servers, volumes, floating IPs, networks, security groups, users, etc. Given this template, Heat “orchestrates” the full lifecycle of the complete stack. It provisions the infrastructure, making all the calls to create the parts and wire them together. To make changes, the user modifies the template and updates the existing stack; Heat knows how to make the right changes. When the stack is undeployed, Heat deletes all the allocated resources.
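To make this concrete, here is a minimal, illustrative HOT template sketch. The resource names and the image parameter are made up, and real templates usually carry more properties:

```yaml
heat_template_version: 2013-05-23

description: Illustrative stack - one server plus a security group

parameters:
  image:
    type: string
    description: Name or ID of the Glance image to boot from

resources:
  web_secgroup:
    type: OS::Neutron::SecurityGroup
    properties:
      rules:
        - protocol: tcp
          port_range_min: 80
          port_range_max: 80

  web_server:
    type: OS::Nova::Server
    properties:
      image: { get_param: image }
      flavor: m1.small
      security_groups:
        - { get_resource: web_secgroup }
```

A typical lifecycle with the CLI of the day: `heat stack-create web -f stack.yaml -P image=<image>` to provision, `heat stack-update` to apply template changes, and `heat stack-delete` to free everything.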
Heat supports auto-scaling: a monitoring event (e.g. a Ceilometer alarm) triggers the scaling policy, and Heat provisions extra instances into the auto-scaling group.
Since Icehouse, Heat supports software configurations: the user defines what software should be installed on an instance, and Heat weaves deploying and configuring it into the instance lifecycle. It is also possible to integrate Heat with configuration management tools like Puppet and Chef.
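A hedged sketch of how such a software configuration can be attached to a server. The resource names here are invented; `OS::Heat::SoftwareConfig` and `OS::Heat::SoftwareDeployment` are the Icehouse-era mechanism, but check the current docs for exact properties:

```yaml
resources:
  install_web:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/sh
        # hypothetical install step
        apt-get install -y nginx

  deploy_web:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: { get_resource: install_web }
      server: { get_resource: web_server }  # assumes a server resource in the same template
```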
Heat also serves as a platform component for other OpenStack services. Heat is used as a deployment orchestration service by TripleO and Solum.
Status: Integrated. It’s a hot project – strong community, strong support. Used in the field; per the OpenStack survey, it is seen in about ~10% of deployments.
Opinions split on Heat: some think Heat is the best, some think it’s a disaster. I don’t cast mine, just bring this point to your attention.
PaaS for developers. An application delivery platform, just like Cloud Foundry and Heroku (in fact it supports Heroku and Cloud Foundry artefacts!), but natively designed for OpenStack.
Solum deploys an application from a public git repository to an OpenStack cloud, onto different language runtimes. A YAML plan file describes the topology of the application. A service add-on framework will provide services for the app to use, like MongoDB, Memcached, New Relic, etc.
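A plan file might look roughly like this. This is a sketch based on early Solum examples; the field names may have changed since, and the repository URL is made up:

```yaml
version: 1
name: example_plan
description: Deploy a sample app from a public git repo
artifacts:
  - name: sample_app
    artifact_type: heroku
    content:
      href: https://github.com/example/sample-app.git
    language_pack: auto
```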
Solum pushes an application through a continuous integration pipeline, from the source code up to the final deployment to production via a Heat template.
To do this, Solum leverages many OpenStack projects, including Heat, Nova, Glance, Keystone, Neutron, and Mistral.
Over time, Solum will guide and support developers through the dev/test/release cycle. It will support rollbacks to previous versions, monitoring, manual and auto-scaling, and other good stuff.
Status: Solum is still in its infancy: announced and started at the Hong Kong summit, by the Atlanta summit it only knew how to run one very basic deployment pipeline. Most of the noted features are on the roadmap for 2015. However, it is a well-run community project with a strong team and solid support from Rackspace, Red Hat, and a few significant others.
Murano is an OpenStack self-service application catalog. It targets cloud end-users (including the least technical ones). If you’re familiar with traditional enterprise service catalog apps, like VMware vCAC or IBM Tivoli Service Request Manager, that’s about what Murano is.
Murano provides a way for developers to compose and publish high-level applications – anything from a simple single virtual machine to a complex multi-tier app with auto-scaling. Murano provides a YAML-based language to define an application, and the API and UI to publish it to the service catalog.
End users browse a categorized catalog of applications through the self-service portal, and get their apps provisioned and ready to use with a “push of a button”.
Status: Murano is an “OpenStack Related” project, likely to apply for “Incubating” in the Juno release cycle. First released in May 2013, it is functional and stable. Murano has already been used in the field (typically brought in by Mirantis and customized by their professional services), especially by customers who need Windows-based environments.
Day-two operations is maintaining and managing the cloud and workloads, keeping it all running.
Responding to hardware failures and app performance degradations, troubleshooting, reactive and proactive maintenance, and other boring, mundane tasks that we really want to automate so we can spend time on more creative work…
This is a wide area, but the projects here are only emerging now.
Some, like Rubick, Blazar, and Satori, solve specific, narrow use cases. Others, like Mistral and Congress, are general-purpose automation tools.
I’ll skip details on smaller projects - Entropy, Gantt and Tetris – and just mention them here for completeness.
Blazar (ex. Climate)
Addresses a pain point of most corporate private clouds: there is no incentive for users to return resources. The concept of leases and scheduling is a common solution.
Blazar manages the “lease” of cloud resources (virtual or physical): scheduling resource use in the future, negotiating lease terms between the user and the system, automating the process of allocating and releasing the resources, and providing visibility into resource consumption.
Nova and other services must be aware of the lease concept, so Blazar introduces Nova filters and API extensions for leases and schedules.
Status: “Related”. Early, but with basic functionality in place. It currently implements reservation of virtual instances and physical hosts; with its pluggable architecture, support for other resources is coming. Applying for incubation.
Rule-based diagnostic tool for OpenStack configurations.
Rubick is a diagnostic tool that inspects and validates OpenStack cloud configurations for correctness and consistency, and reports any errors or misconfigurations.
Rubick auto-discovers an OpenStack cluster, extracts the actual configurations of OpenStack components (Keystone, Cinder, Nova, etc.), and checks them against a rule set to validate consistency and correctness. Some rules are simple syntax checks of configuration parameters. Other rules are more complex and inspect the entire model to find semantic inconsistencies across multiple OpenStack components. A simple web UI walks the user through the process of discovery and validation, and reports configuration errors and warnings.
Fixing misconfigurations, although part of Rubick’s mission statement, is not implemented and doesn’t seem to be on the roadmap.
Status: OpenStack Related project. It is functional and complete. I think what drags Rubick adoption is the content: it may become usable if the community jumps on creating rules; as of now there are no rules out of the box.
Satori provides configuration discovery for existing infrastructure
Given a URL and some credentials, Satori will discover the resource behind this URL, figure out its role, figure out how this resource is related to the OpenStack cloud (e.g., it’s a Nova instance, or a Cinder control node), and list the services running on this server.
With a pluggable implementation on the roadmap, Satori plans to discover non-OpenStack infrastructure: APIs of other clouds, nodes in a Chef server, operating system and application topologies, runtime processes, and relations between the systems.
Satori is conceptually similar to discovery tools like Ohai and Facter, and may leverage these tools, adding OpenStack specifics to them.
Status: Satori is a very young project, started in 2014, that just had its first POC in March 2014.
Generic policy monitoring and enforcement framework.
Congress monitors a set of cloud services for policy compliance, and applies corrective actions when violations are identified. It can even prevent violations from happening in the first place when possible: conceptually, by making cloud services consult with Congress; practically, by making this consultation implicit and stopping the operations that violate the policies whenever possible (e.g., cancelling the create operation of a non-conforming VM).
The policies are declared in the Datalog language. (It takes a bit of getting used to, but makes sense.)
Congress uses data providers to connect with cloud services, fetch the relevant data, keep it up to date, and execute corrective actions.
As an independent cross-domain framework, Congress overcomes the limitations of domain-specific policy enforcement efforts. It is capable of handling cross-domain policies, like “Every network attached to a VM must be a private network owned by someone in the same group as the VM owner” – touching Neutron, Nova, and Keystone.
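In Datalog, that example policy could be sketched roughly as follows. The first rule flags a VM attached to a non-private network; the second flags a VM whose network owner is not in the same group as the VM owner. The table and column names here are illustrative, not Congress’s exact schema:

```
error(vm) :- nova:virtual_machine(vm),
             nova:network(vm, net),
             not neutron:private_network(net)

error(vm) :- nova:virtual_machine(vm),
             nova:network(vm, net),
             nova:owner(vm, vm_owner),
             neutron:owner(net, net_owner),
             not same_group(vm_owner, net_owner)

same_group(x, y) :- keystone:group(x, g), keystone:group(y, g)
```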
Status: Related OpenStack project. Well-thought-out design. Very early implementation, not fully functional: basic architecture, policy language support, and rudimentary datasource plugins for Nova and Neutron.
To be successful, it will need buy-in from other OpenStack projects, so that they provide their own plugins for policy monitoring and enforcement.
Mistral is a workflow service for OpenStack cloud automation.
“Workflow based Run-book automation done right”.
A workflow – a sequence of tasks with transitions and conditional logic – is expressed as a YAML-based definition. A workflow can be triggered on demand, on schedule, or on a monitoring event. Mistral runs workflows at scale, with high availability and resilience. It executes task actions, keeps the workflow state, and carries data between the tasks.
Mistral offers an extensible set of actions, with SSH, REST HTTP, email, and an OpenStack pack out of the box. A basic UI is available as a Horizon dashboard; a visual representation of workflow plans and executions is on the roadmap.
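For flavor, a hedged sketch of a workflow definition. The Mistral DSL was still evolving at the time of this talk, so treat the exact syntax as illustrative; the workflow name, action names, and email address are made up:

```yaml
version: '2.0'

reboot_and_notify:
  description: Reboot a misbehaving VM, then email the operator
  input:
    - vm_id
  tasks:
    reboot_vm:
      action: nova.servers_reboot server=<% $.vm_id %>
      on-success:
        - notify
    notify:
      action: std.email
      input:
        to_addrs: ['ops@example.com']
        subject: 'VM rebooted'
        body: 'Workflow rebooted VM <% $.vm_id %>'
```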
Mistral’s target users are cloud administrators, who use workflows for automating operational procedures, integrating across cloud components, and integrating cloud services with business processes (e.g. JIRA, human intervention). Application developers can leverage Mistral as a workflow service, similar to AWS Simple Workflow.
Mistral is also a platform component for other OpenStack services; many of them need the concept of a workflow service. Solum, Fuel, Barbican, Murano, Keystone, Trove, Congress, and a few more are looking at integrating with Mistral.
Status: Mistral is a Related project, planning to apply for incubation in the Juno cycle. It is currently a solid, functional pilot, going through a series of internal refinements and refactorings.
StackStorm – we know from experience that a workflow service is a key component of a holistic automation solution.
Cloud deployment: a solved problem. Surprising, given that so many still do it with raw Puppet/Chef.
WORKLOAD: Why so few projects? Maybe because there are many products and custom solutions outside OpenStack? But Heat is here to stay, and Solum is interesting to watch.
DAY 2: This used to be a known field, before cloud and devops changed the landscape. Now the old approaches no longer work and the new ones have not quite established themselves; this is the area where I expect the most storming. Many approaches are coming, some competing, some complementary; the jury is still out on what’s the right way to do day-2 automation.
OpenStack Automation Overview
CTO @ StackStorm Inc.
Pilot: Idea and “skeleton” implementation
Functional: Key use cases and architecture in place
Feature-complete: MVP, set of use cases implemented, stable to try
Production-used: seen used in clouds
Integrated: officially received OpenStack “Integrated” status
• Mature. Open-sourced since 2013
• Used in the field
“The control plane
for installing and managing OpenStack”
• Contributed by Huawei in Jan 2014
• Not seen used except Huawei
“An open source project designed
to provide ‘deployment as a service’
to a set of bare metal machines.”
• Officially “Integrated”
• Functional, but work in progress (?)
• Part of HP Helion
“Installs, upgrades and operates OpenStack
cloud using OpenStack own cloud facilities”
•DevStack: favorite for development and
playing around with OpenStack
•PackStack: a utility that uses Puppet modules
to deploy OpenStack parts on pre-installed servers
•Warm: provides the ability to deploy
OpenStack resources from Yaml templates.
•Inception: OpenStack in OpenStack for testing
•Anvil: DevStack in Python, by Yahoo.
• Integrated. Maturing, vivid community
• Used in the field
• Platform for other OpenStack services
“Orchestration service to launch multiple
composite cloud applications using templates”
“Making cloud services easier to consume and
integrate into application development process”
• Early: 1st POC in Atlanta 2014
• Cool features target end ‘14 or year ‘15
• Strong, well-run community
• Functional and stable
• Field-used (esp. for Windows services)
“OpenStack self-service application catalog”
Day 2 Operation Automation
Blazar (ex. Climate)
• Basic functionality in place
“OpenStack reservation as a service project”
• Functional and complete
• Lacks buy-in from other OpenStack projects
“Rule-based diagnostic tool
for OpenStack configurations”
• New (started 2014)
• In early development (first POC March ‘14)
“Provides configuration discovery for existing infrastructure”
“The open policy framework for the cloud”
Cross-domain policy: “Every network attached to a VM must be a private network owned by someone in the same group as the VM owner”
• Well thought out design
• Basic implementation - not fully functional
• Seeks buy-in from OpenStack services
“The open policy framework for the cloud”
• New – pilot in Atlanta ’14
• Main functionality in place, refactoring
• Platform for other OpenStack services
“Workflow service for OpenStack cloud”
• Cloud deployment - solved problem
• Workload deployment – few projects inside; many products and solutions outside of OpenStack
• Day 2 automation – emerging,
many approaches, no winner yet
StackStorm on Automation
Join an online session, see
• Tuesday July 22nd, 11:30 AM Pacific time
• Thursday, July 24th, 8:00 AM Pacific time