Cloud Foundry on OpenStack
An Experience Report
Introduction
about.me/fischerjulian
The anynines Stack
Hardware
OpenStack
Cloud Foundry
VMware
We migrated from a rented VMware to a
self-hosted OpenStack.
For more details on this:
http://rh.gd/a9vmw2sos
Things we had to
think about
OpenStack Upgrades
Before Grizzly
OpenStack was
not ready for production
• The upgrade process included a lot of
manual work
• No script driven upgrades
• Manual DB schema migrations
• Manual configuration file changes, etc.
"I scheduled a week of
total downtime with all instances
offline." - jon@jonproulx, http://rh.gd/1sNhiiz
Upcoming Upgrade
Havana > Icehouse
• Chef is used to roll-out Icehouse (incl.
configuration changes)
• The upgrade is well tested on a separate
multi-server OpenStack staging system
Goal:
<30 min downtime.
Let’s keep our fingers crossed :)
Looking forward to rolling
upgrades with OpenStack
Icehouse
http://rh.gd/1ymhViL
• No need to shut down VMs during
upgrade
• No downtime of the entire cloud
VM availability
What kills VMs?
• Random kernel panics (kernel bug)

http://rh.gd/1oBUeCc
• Hardware outages (hw & power failures)
• …
Availability Zones
• Build disjoint networks, racks, etc.
• Each disjoint zone = availability zone
• Tell OpenStack about availability zones
• On provision you can choose the AZ
• Build Bosh releases accordingly
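The zone-aware placement described above can be sketched in a few lines of Ruby. This is only an illustration: `spread_across_zones` and the zone names `az1`/`az2` are hypothetical and not taken from Bosh or OpenStack. It simply assigns instances round-robin so that no single zone holds every instance of a job.

```ruby
# Hypothetical helper (not a Bosh or OpenStack API): assign `count`
# instances of a job to the given zones in round-robin order.
def spread_across_zones(job, count, zones)
  raise ArgumentError, "need at least one zone" if zones.empty?

  (0...count).map do |index|
    { name: "#{job}/#{index}", zone: zones[index % zones.size] }
  end
end

# Usage: four router instances over two assumed zones.
spread_across_zones("router", 4, %w[az1 az2]).each do |entry|
  puts "#{entry[:name]} -> #{entry[:zone]}"
end
```

With at least two zones and at least two instances, losing one zone still leaves an instance of the job running elsewhere.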
Aggregates
• Similar to AZ
• Not about failover
• Select hosts with certain attributes
• E.g. SSD-aggregate
• On provision choose host with SSD disks
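As a rough model of attribute-based host selection, the sketch below filters hosts by aggregate metadata. The `Aggregate` struct, the `hosts_matching` helper, and the host names are assumptions made for illustration; they mimic the idea and are not the OpenStack scheduler's implementation.

```ruby
# Hypothetical model: an aggregate groups hosts and carries metadata.
Aggregate = Struct.new(:name, :metadata, :hosts)

# Return the hosts of every aggregate whose metadata contains all
# required key/value pairs.
def hosts_matching(aggregates, required)
  aggregates
    .select { |agg| required.all? { |key, value| agg.metadata[key] == value } }
    .flat_map(&:hosts)
    .uniq
end

# Usage with made-up aggregates and host names.
example_aggregates = [
  Aggregate.new("ssd-aggregate", { "ssd" => "true" }, %w[compute-01 compute-02]),
  Aggregate.new("hdd-aggregate", { "ssd" => "false" }, %w[compute-03])
]
puts hosts_matching(example_aggregates, { "ssd" => "true" }).inspect
```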
Load Balancing
• Not inherently clustered
• LBaaS failover can be realized using
• Pacemaker/corosync
• GlusterFS (share LB configuration)
VM Failover Strategies
Resurrect
• Monitor VM
• Re-Build VMs automatically
• e.g. using Cloud Foundry Bosh
• + Easy
• - Takes long (minutes not seconds)
• - OpenStack doesn’t release persistent
disks automatically
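The resurrect strategy boils down to a monitoring loop that rebuilds a VM after it has missed several health checks. The sketch below is a simplified stand-in, not how Bosh implements it: the `Resurrector` class, the `alive` and `rebuild` callables, and the `max_missed` threshold are all hypothetical.

```ruby
# Hypothetical resurrector: `alive` and `rebuild` are callables supplied
# by the caller (e.g. a ping check and an IaaS "recreate VM" request).
class Resurrector
  def initialize(alive:, rebuild:, max_missed: 3)
    @alive = alive
    @rebuild = rebuild
    @max_missed = max_missed
    @missed = Hash.new(0) # consecutive failed checks per VM
  end

  # Run one monitoring pass and return the VMs that were rebuilt.
  def tick(vms)
    rebuilt = []
    vms.each do |vm|
      if @alive.call(vm)
        @missed[vm] = 0
        next
      end
      @missed[vm] += 1
      next if @missed[vm] < @max_missed

      @rebuild.call(vm)
      @missed[vm] = 0
      rebuilt << vm
    end
    rebuilt
  end
end
```

Requiring several consecutive misses avoids rebuilding a VM because of a single dropped health check, at the price of a longer recovery time.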
Failover to Standby VM
• Provide stand-by VM
• Monitor VM and perform failover
• e.g. using Pacemaker
• + Fast failover (seconds)
• - Pacemaker is not easy to use
• - Increased resource usage by standby
VM(s)
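To make the standby approach concrete, here is a minimal heartbeat-timeout sketch. It is not Pacemaker and does not use its API: the `StandbyPair` class, the numeric timestamps, and the five-second timeout are assumptions chosen for illustration.

```ruby
# Hypothetical active/standby pair: roles are swapped when the active
# VM has sent no heartbeat for longer than `timeout` seconds.
class StandbyPair
  attr_reader :active, :standby

  def initialize(active, standby, started_at:, timeout: 5)
    @active = active
    @standby = standby
    @timeout = timeout
    @last_heartbeat = started_at
  end

  # Record a heartbeat from the active VM at time `now` (in seconds).
  def heartbeat(now)
    @last_heartbeat = now
  end

  # Return true and swap roles if the active VM is considered dead.
  def check(now)
    return false if now - @last_heartbeat <= @timeout

    @active, @standby = @standby, @active
    @last_heartbeat = now
    true
  end
end
```

Because the standby is already running, the failover time is dominated by the timeout rather than by VM provisioning.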
IP Failover
Three ways to fail over IPs
Load Balancer
• + Fast
• + Easy (use lb weights)
• - LB becomes a bottleneck
• When OpenStack supports HAProxy (L3), this
drawback can be eliminated
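Failover through load balancer weights can be pictured as setting the failed backend's weight to zero. The helpers `fail_over` and `traffic_share` and the backend names below are hypothetical; a real load balancer exposes this through its own configuration or API.

```ruby
# Hypothetical helper: drain a failed backend by setting its weight to 0.
def fail_over(weights, failed_backend)
  weights.merge(failed_backend => 0)
end

# Fraction of traffic each backend receives for a given weight table.
def traffic_share(weights)
  total = weights.values.sum
  return {} if total.zero?

  weights.transform_values { |weight| weight.to_f / total }
end

# Usage: two equally weighted backends, then vm-a fails.
example_weights = { "vm-a" => 1, "vm-b" => 1 }
puts traffic_share(fail_over(example_weights, "vm-a")).inspect
```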
Floating-IPs
• + Easy
• + Fast
• - Only for public networking
NIC Re-attachment
• + No network bottleneck
• + No dependencies on other services
• - Slightly higher failover time (several
seconds)
Implications for
Cloud Foundry
Accept that VMs are
ephemeral
Distribute CF components
across OpenStack availability zones
• 2 * UAA
• 2 * CC
• 2 * n * DEAs
• 2 * Health Manager
• …
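A simple sanity check for such a layout is to verify that every component runs in at least two zones. The sketch below assumes a plain hash from component name to the zones of its instances; `single_zone_components` and the sample layout are hypothetical and not derived from a real Bosh manifest.

```ruby
# Hypothetical check: list components whose instances cover fewer than
# `min_zones` distinct availability zones.
def single_zone_components(deployment, min_zones: 2)
  deployment.select { |_component, zones| zones.uniq.size < min_zones }.keys
end

# Usage with a made-up layout; health_manager runs twice but in one zone.
example_deployment = {
  "uaa" => %w[az1 az2],
  "cloud_controller" => %w[az1 az2],
  "dea" => %w[az1 az1 az2 az2],
  "health_manager" => %w[az1 az1]
}
puts single_zone_components(example_deployment).inspect
```

Counting instances alone is not enough: two instances in the same zone still disappear together when that zone fails.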
UAA & CC DB = SPOF
HA Postgres
• UAA and Cloud Controller database
• Single point of failure for Cloud Foundry
• Postgres is not inherently clusterable >
failover with standby VM
• Master/slave replication
• Pacemaker/corosync
• IP-Failover using NIC-reattachment
That’s half way towards a
PostgreSQL CF Service
• Add a V2 Service Broker
• Add provisioning logic
• Provision a 2-node DB cluster on:
cf create service postgres medium-cluster
CF Service Design
• Use clusterable services if possible
• Implement automatic failover if not
• Autoprovisioning using Bosh
• Organize self-healing
• (Semi-)Automatic recovery from
degraded mode
Summary
• VMware’s high availability options are
nice
• OpenStack helped us cut costs by 50%
• OpenStack is stable enough to run Cloud
Foundry on top
• OpenStack hardening is required and feasible
Open Source OpenStack
and Open Source Cloud
Foundry are SMEs’ best
friends!
Questions?
Thank you!
Preparing for
disaster recovery
• Cinder Volume Snapshots
OpenStack Backups
OpenStack Swift
• Open source Amazon S3 replacement
• Object store with RESTful interface
• Scales horizontally to petabyte
dimensions
• Fully redundant, highly available
• CF service > App Asset Storage
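To illustrate the RESTful interface, the sketch below builds (but does not send) an object upload request with Ruby's standard library. It assumes the usual Swift layout of storage URL, container, and object name plus an `X-Auth-Token` header; the helper names, the host `swift.example.test`, the account `AUTH_demo`, and the token are placeholders.

```ruby
require "net/http"
require "uri"
require "erb"

# Hypothetical helper: URL of an object below an (assumed) storage URL.
# Each path segment is percent-encoded; slashes in the name are kept.
def swift_object_uri(storage_url, container, object_name)
  segments = [container, *object_name.split("/")]
  path = segments.map { |segment| ERB::Util.url_encode(segment) }.join("/")
  URI.parse("#{storage_url.chomp('/')}/#{path}")
end

# Build a PUT request that would upload `body`; nothing is sent here.
def build_upload_request(storage_url, token, container, object_name, body)
  request = Net::HTTP::Put.new(swift_object_uri(storage_url, container, object_name))
  request["X-Auth-Token"] = token
  request.body = body
  request
end

# Usage with placeholder values.
request = build_upload_request(
  "https://swift.example.test/v1/AUTH_demo", "placeholder-token",
  "app-assets", "img/logo.png", "binary data"
)
puts "#{request.method} #{request.path}"
```

In practice a storage library such as fog hides these requests behind a directory and file abstraction.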
Code
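# Excerpt of a Blobstore wrapper built on the fog storage library.
require "fileutils"
require "find"
require "fog"

class Blobstore
  def initialize(connection_config, directory_key, cdn=nil, root_dir=nil)
    @root_dir = root_dir
    @connection_config = connection_config
    @directory_key = directory_key
    @cdn = cdn
  end

  # True when the configured fog provider is the local filesystem.
  def local?
    @connection_config[:provider].downcase == "local"
  end

  # True when a blob is stored under the given key.
  # Relies on a `file(key)` lookup that is not part of this excerpt.
  def exists?(key)
    !file(key).nil?
  end

  # ... (the rest of the class is not shown on the slide)
end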
