Cloud Orchestration is Broken

The Cloud is Broken
Those who ignore history are doomed
to repeat it
Edgar Román
emroman@pbs.org
March 3rd, 2015
DC Python Meetup

Caveats, Disclaimer, etc
• These are my opinions
• I am not yet omniscient so my knowledge of
tools mentioned may be inaccurate
• We’re really talking about Cloud Orchestration
• For moderate to complex environments (my blog
doesn’t count)
– Beyond web app / db

Our Architecture – V1
• Web App tier
– Runs code from git repo
• DB Master with slaves
– Hopefully managed by DDL in repo (e.g.
Django migrations)
• Memcache/Redis layer
– Simple and self-configuring
• Celery Queue
– Asynchronous jobs, persistent queue
• Job worker pool

And more…
• Web App tier
– Lives in Auto-Scaling group
– Allows inbound tcp connections on 80/443 via load
balancer
• DB Master with slaves
– Only one inbound tcp port allowed
– Defined set of network connections for replication
• Memcache/Redis layer
– Restricted access to this from Web Apps only
• Celery Queue
– Web App can queue jobs, workers can pop
• Job worker pool
– No inbound access at all!
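
A minimal sketch of how the access rules above could be written down as data instead of being clicked together in a console. Only ports 80/443 come from the slide; the tier names, the other port numbers, and the `is_allowed` helper are illustrative assumptions, and replication links are left out for brevity.

```python
from typing import NamedTuple


class Inbound(NamedTuple):
    source: str  # tier (or "load_balancer") allowed to open the connection
    port: int    # TCP port on the receiving tier


# Inbound rules per tier for the V1 architecture. Only 80/443 are from the
# slides; every other port number and tier name is an assumed placeholder.
V1_INBOUND = {
    "web": [Inbound("load_balancer", 80), Inbound("load_balancer", 443)],
    "db_master": [Inbound("web", 5432)],   # only one inbound port
    "cache": [Inbound("web", 6379)],       # web apps only
    "queue": [Inbound("web", 5672), Inbound("worker", 5672)],
    "worker": [],                          # no inbound access at all
}


def is_allowed(rules, source, target, port):
    """Return True if `source` may connect to `target` on `port`."""
    return Inbound(source, port) in rules.get(target, [])
```

With rules in this form, a question like "can a worker reach the cache?" becomes a lookup that can be reviewed and tested rather than something inferred from a console.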

Then we evolve
• V2
– Adds ElasticSearch tier
• V3
– Adds nightly Hadoop batch

Add some environments…
• Production, Staging, QA
• Then the devs want a local copy to work on

The challenge
• Production is on v1
• V2 is in QA
• Devs working on V3
And I need to manage them all quickly and easily
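
One way to picture the problem, as a sketch only: treat each architecture version as a named set of tiers and pin every environment to a version. The tier names and both helper functions are hypothetical; staging is omitted because the slide does not say which version it runs.

```python
# Each architecture version is the previous one plus the tiers it adds.
V1 = ("web", "db", "cache", "queue", "workers")
V2 = V1 + ("elasticsearch",)
V3 = V2 + ("hadoop_batch",)

ARCHITECTURES = {"v1": V1, "v2": V2, "v3": V3}

# Which version each environment is pinned to, per the slide.
ENVIRONMENTS = {"production": "v1", "qa": "v2", "dev": "v3"}


def tiers_for(environment):
    """Return the tiers that must exist in the given environment."""
    return ARCHITECTURES[ENVIRONMENTS[environment]]


def added_tiers(old, new):
    """Return the tiers to create when moving from version `old` to `new`."""
    return [t for t in ARCHITECTURES[new] if t not in ARCHITECTURES[old]]
```

Promoting V2 from QA to production then amounts to changing one pin and creating whatever `added_tiers("v1", "v2")` reports.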

Philosophy Shift
• Olden days
– Used Visio to track changes to the physical
hardware
• Now
– Use tools to track multiple environments or
tiers in the cloud
• Why not
– Create the entire architecture as needed,
preconfigured, and on-demand

If you create a single virtual entity in a
cloud without a script, it is like writing a
Perl script on a server somewhere
without telling anyone

We’ve learned so much from software
development,
why can’t we use this knowledge for
cloud orchestration and management?

Modules / Decomposition
Versioning
Code Reuse / DRY
Abstraction
Compilation / Build Workflow

Modules / Decomposition
• We know from software:
– Grouping makes sense
– Helps organize logical sets of things
• What we have in cloud management:
– Default view of Chef management consoles is
a flat list of nodes
– Vast majority of tutorials and examples put all
hosts in a single network
– AWS EC2, Chef, and Ansible support optional
grouping by tagging
• Conclusion: Poor holistic support

Versioning
• What we know from software:
– Versioning is critical for tracking features and
bugs
– Allows recovery from errors, mistakes, and
disasters
– Versioning is important not just at file level, but
for the whole project
• What we have in cloud management:
– Ansible and Chef only version individual
playbooks/cookbooks, not
projects/environments/collections
– Restoring a known state for a cloud project is a
manual process
• Conclusion: Poor holistic support
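
To make "versioning the whole project" concrete, here is a small sketch that stores every released state of an entire environment under a tag so a known state can be restored in one call. The `EnvironmentHistory` class, the tags, and the playbook pins are all hypothetical; no existing tool's API is implied.

```python
import copy


class EnvironmentHistory:
    """Keeps every released state of a whole environment, not just one file."""

    def __init__(self):
        self._versions = []  # (tag, state) pairs in release order

    def release(self, tag, state):
        # Deep-copy so later edits to `state` cannot alter the stored version.
        self._versions.append((tag, copy.deepcopy(state)))

    def restore(self, tag):
        """Return a copy of the state that was released under `tag`."""
        for stored_tag, state in self._versions:
            if stored_tag == tag:
                return copy.deepcopy(state)
        raise KeyError(tag)


history = EnvironmentHistory()
state = {
    "web": {"playbook": "web@1.4", "count": 4},  # made-up pins and counts
    "db": {"playbook": "db@2.0", "count": 2},
}
history.release("r1", state)
state["web"]["count"] = 8
history.release("r2", state)
```

Here `history.restore("r1")` hands back the complete earlier state, playbook pins and instance counts together, which is the part that is manual today.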

Code Reuse / DRY
• What we know from software:
– Repeating yourself causes bloat and often errors
when refactoring / updating code
– Updates in normalized code are easier and well
understood
• What we have in cloud management:
– Minimal support for extra variables in
Ansible/Chef/CloudFormation per class of server
– Global variables for credentials
– Generally would need to cut/paste extra variables
in multiple places
• Conclusion: We’re getting there
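
A sketch of what DRY variables could look like: define each value once and layer environment overrides on top of server-class settings on top of globals. The variable names and values are invented for illustration; `collections.ChainMap` is from the Python standard library.

```python
from collections import ChainMap

# Each value is defined once, at the most general level where it applies.
GLOBAL = {"region": "us-east-1", "log_level": "info"}
PER_CLASS = {"web": {"port": 443}, "worker": {"concurrency": 4}}
PER_ENV = {"qa": {"log_level": "debug"}, "production": {}}


def variables_for(server_class, environment):
    """Resolve variables with environment > server class > global precedence."""
    # ChainMap looks keys up in the first mapping that contains them.
    return dict(ChainMap(PER_ENV[environment], PER_CLASS[server_class], GLOBAL))
```

Changing the log level for QA touches one line, and every server class in QA picks it up without any cut and paste.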

Abstraction
• What we know from software:
– Abstractions like file I/O allow the same code
to run on multiple platforms
• What we have in cloud management:
– Most tools support multiple clouds (AWS,
Rackspace, etc.)
– OpenStack is the closest analogy to a cloud
abstraction
• Conclusion: Very Promising

Compilation / Workflow
• What we know from software:
– Compilation of code enables easy transport
and packaging
– Enables DRY capabilities
• What we have in cloud management:
– Workflows are generally supported, but not
necessarily holistically or with versioning of
the workflows themselves
• Conclusion: Not Bad

So…we should extend tools…
• to deal with not just servers, but networks and
other entities (abstraction)
• to manage collections of these entities
(modules)
• to manage versioning of these collections
(versioning)
• to allow configuration of these versioned
collections per environment (DRY)
• to allow deployment (workflow) of these
versioned collections with configurations to
specific environments

Keep an eye on…
• Apache CloudStack
– http://cloudstack.apache.org/
• Cloudify
– http://getcloudify.org/

Questions?
Oh yeah, we’re hiring…
