Cyansible
Blending Blue/Green Deployments
@Betterment
Techcrunch Disrupt
May 2010
90,000 customers,
more every minute.
Fastest growing automated
investing service
$2B+
Who are we?
betterment : investing
::
devops : engineering
it’s 2012.
let’s ship some code.
Betterment@2012
A Better Migration: From Snowflakes to Stormtroopers
Shameless Plug: Wednesday, July 22, 6:30p - 8:00p @ AWS Pop-up Loft
it’s 2015.
let’s ship new code…
without interrupting
production
● Predictable
● Repeatable
● Minimal Human Interaction
● Zero User Interruption
● Contained Failure
Dream Delivery
Blue/Green Deployments
http://martinfowler.com/bliki/BlueGreenDeployment.html
DNS
ELB
Pre-flight Checklist
❏ Stateless Servers
❏ Server Health Check
❏ Duplicate Full Stack, including RDS Replica
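The "Server Health Check" item can be sketched as a gate that runs before new servers are made public. This is a minimal illustration, not Betterment's implementation: the `/health` path and the names `http_status` and `all_healthy` are assumptions made for the example. The all-or-nothing rule mirrors the "fail tolerance percent: 0%" idea from the speaker notes.

```python
from typing import Callable, Iterable
from urllib.error import URLError
from urllib.request import urlopen


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status for url, or 0 if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except (URLError, OSError):
        # HTTPError (4xx/5xx) is a URLError subclass, so it also lands here.
        return 0


def all_healthy(hosts: Iterable[str],
                fetch_status: Callable[[str], int] = http_status,
                path: str = "/health") -> bool:
    """True only if there is at least one host and every host answers 200.

    The "/health" route is a hypothetical name; use whatever route the
    application actually exposes.
    """
    host_list = list(hosts)
    return bool(host_list) and all(
        fetch_status(f"http://{host}{path}") == 200 for host in host_list
    )
```

Passing `fetch_status` as a parameter keeps the gate testable without a network: a stub that returns canned status codes is enough to exercise it.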
Wait. Two Databases?
“There's still the issue of dealing with missed transactions
while the green environment was live, but depending on
your design you may be able to...
● feed transactions to both environments in such a way
as to keep the blue environment as a backup when the
green is live. Or you may be able to...
● put the application in read-only mode before cut-over,
run it for a while in read-only mode, and then switch it to
read-write mode.”
http://martinfowler.com/bliki/BlueGreenDeployment.html
“Code_i always works on Schema_{i+1}”
(A.K.A. old code works on new schema)
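One way to picture "old code works on new schema" is an additive migration: the new schema only adds a column with a default, so inserts written against the old schema keep succeeding. The sketch below uses an in-memory SQLite database purely for illustration; the `users` table, the `plan` column, and the function names are hypothetical and do not come from the talk.

```python
import sqlite3


def old_code_insert(conn: sqlite3.Connection, email: str) -> None:
    # Old code: only knows the columns that existed before the migration.
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def migrate_to_next_schema(conn: sqlite3.Connection) -> None:
    # New schema: additive change with a default, so old inserts still work.
    conn.execute("ALTER TABLE users ADD COLUMN plan TEXT DEFAULT 'basic'")


def new_code_insert(conn: sqlite3.Connection, email: str, plan: str) -> None:
    # New code: uses the column introduced by the migration.
    conn.execute("INSERT INTO users (email, plan) VALUES (?, ?)", (email, plan))


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    old_code_insert(conn, "a@example.com")      # old code, old schema
    migrate_to_next_schema(conn)                # migration runs before deploy
    old_code_insert(conn, "b@example.com")      # old code, new schema
    new_code_insert(conn, "c@example.com", "premium")  # new code, new schema
    return conn.execute("SELECT email, plan FROM users ORDER BY id").fetchall()
```

A destructive change (dropping or renaming a column the old code still writes to) would break the second `old_code_insert` call, which is the kind of migration this constraint rules out.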
Publish, Migrate, Deploy
Jenkins’ Job
1. Build
2. Test
3. Package
4. Publish
5. Run Migrations
6. Invoke Ansible
7. Cull Zombies
Ansible’s Job
1. Check for S3 deliverables
2. Spin up new EC2 Instance(s)
3. Apply role(s) to instance(s)
4. Find instance(s) in ELB
5. Add new instance(s) to ELB & tag
o status: in-use
6. Remove & tag instances
o status: zombie
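Steps 4 through 6 above can be modeled in memory to show why the ordering matters: new instances join the load balancer before old ones leave, so it is never empty. This is a toy model under stated assumptions, not the actual playbook. The `Instance` and `LoadBalancer` classes and the `swap` function are invented for illustration; only the tag values `in-use` and `zombie` come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    instance_id: str
    status: str = "pending"  # tag value before the instance serves traffic


@dataclass
class LoadBalancer:
    members: list = field(default_factory=list)


def swap(elb: LoadBalancer, new_instances: list) -> list:
    """Add new instances first, then remove the old ones; return the old ones."""
    old = list(elb.members)          # step 4: find instances currently in the ELB
    for inst in new_instances:       # step 5: add new instances and tag them
        elb.members.append(inst)
        inst.status = "in-use"
    for inst in old:                 # step 6: remove old instances and tag them
        elb.members.remove(inst)
        inst.status = "zombie"
    return old
```

The returned list is what a later cleanup step would terminate; nothing is destroyed during the swap itself.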
Bootstrapping Ansible
./exec/ directory
contains Jenkins entry points
Ansible code lives
in the repo it provisions.
Publish it like an app.
./exec/brochure-deploy.sh
brochure-deploy.yml
you are
here.
00:00:10.888
brochure-deploy > provision-new-ec2-instances
brochure-deploy > provision-new-ec2-instances > identify-elb-AZs
brochure-deploy > provision-new-ec2-instances > ec2-deploy
AMI Hierarchy
brochure-deploy.yml
you are
here.
00:01:18.415
brochure-deploy > configure-instances
brochure-deploy > configure-instances > roles/deploy/brochure/tasks/main.yml
brochure-deploy.yml
you are
here.
00:04:49.414
brochure-deploy.yml
you are
here.
00:04:50.188
brochure-deploy > find-instances-in-elb
brochure-deploy.yml
you are
here.
00:04:50.911
brochure-deploy > add-instances-to-elb
brochure-deploy.yml
you are
here.
00:05:06.295
brochure-deploy > decommission-instances
brochure-deploy.yml
you are
here.
00:05:21.226
EC2
INSTANCE
● Predictable
● Repeatable
● Minimal Human Interaction
● Zero User Interruption
● Contained Failure
Dream Delivery Achieved
The Future
● Long Running Instances + Docker
o Huge speed improvement
● Post Monolith, Abandon Jenkins?
o Travis CI for Build/Test
o Tower for Deployment Orchestration
● Ansible Galaxy?
Questions?
alan@betterment.com
@nonrational
github.com/nonrational
careers@betterment.com
All code snippets & diagrams contained in this presentation are property of Betterment, but please learn from them.
All photographs / GIFs used in this presentation are someone else’s.
Street Fighter, Back To The Future, Indiana Jones, Futurama, and Arrested Development are someone else’s property too.

Editor's Notes

  • #4 Background on who we are: Betterment is an online investing service, helping people better manage and grow their wealth through smarter technology. Disrupting the investing and financial industry. Largest and fastest-growing automated investing service. More than 90,000 customers and quickly growing.
  • #5 Betterment is an online investing service, helping people better manage and grow their wealth through smarter technology. Invest in a diversified portfolio. Automate everything for you, from rebalancing and dividend reinvestment to automatic deposits. Tax efficient.
  • #7 Very, very confusing at times, with the potential to waste a lot of your time before you figure out the right approach. The dream of devops is to create an environment where the path of least resistance also yields the most efficient, sane result. Betterment is totally on board with that mission from the ground up.
  • #8 It’s gonna take all night. It’s Friday night. I have no date, a 2-liter bottle of Shasta, and my all-Rush mixtape. Let’s rock.
  • #9 Deploy once a month, at 2AM. Unpredictable pre-prod. Rented iron: by definition, long-running. Load balancer with “status.txt” file. Manual package installation. “Python2.4 must be default”.
  • #11 AWS: VPC, AZ subnets, EC2, ELB, Multi-AZ RDS.
  • #14 predictable deployments
  • #16 DNS Update: unpredictable TTLs; never for consumer traffic; okay for internal traffic. Elastic Load Balancer: re-use “pre-warmed” long-running ELBs; multiple AZs; health checks; connection draining; sticky sessions.
  • #17 Goes without saying you need HTTPS. Long-running machine columns are still a pain to provision; give me machines on tap. Ansible means I can stand up stacks of… Stateless Sessions / Servers: stateful authentication cookies; HTTPS everywhere. Column Health Awareness: health check routes. Two Full-stack “Columns”: rent 2x N-machine columns; elastic cloud + Ansible; 2x web / app / database nodes.
  • #20 New EC2 cluster hits production DB. Migration constraints: old code always works on new schema. Win: simplicity. Lose: “instant” standby stack.
  • #21 Don’t optimize for rollbacks. Fast rollforward: speedy fix & ship. Fast failure: curl healthcheck route before publicizing; fail tolerance percent: 0%. Emergency rollback: ship previous git-hash.
  • #24 ansible parallelization
  • #26 Super ugly, but awesome. Ensure that hash matches master. Scalability!
  • #27 Monolith ⇒ SOA: 2 ⇒ ~10 applications. One repo to rule them all. Version Ansible w/ apps: .tgz, .war deliverables. Layout: ./exec, ./roles/{deploy,provision}, ./playbooks/shared.
  • #35 TODO:
  • #41 Y’all ready for this?
  • #47 CYAN!
  • #53 Sometimes, things do go wrong.
  • #54 database still at v1, old code continues to run….
  • #55 If the deployment fails halfway through, zombie killer will take care of the “ready” or “pending” instances that are not in ELBs. If the deployment succeeds, instances tagged “zombie” will also be reaped.
  • #56 predictable deployments
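The zombie-killer rule described in note #55 can be sketched as a pure selection function: reap anything tagged "zombie", plus any "ready" or "pending" instance that never made it into an ELB. This is an illustrative sketch only. The function name `instances_to_reap` and the (id, status) tuple shape are assumptions, and reaping zombies regardless of ELB membership is a simplification, since the deployment flow removes them from the ELB before tagging.

```python
from typing import Iterable, List, Set, Tuple


def instances_to_reap(instances: Iterable[Tuple[str, str]],
                      elb_member_ids: Set[str]) -> List[str]:
    """Return the IDs of instances the cleanup job should terminate.

    instances is an iterable of (instance_id, status_tag) pairs;
    elb_member_ids is the set of IDs currently registered in any ELB.
    """
    reap = []
    for instance_id, status in instances:
        in_elb = instance_id in elb_member_ids
        if status == "zombie":
            # Successful deploy: old instances were tagged for removal.
            reap.append(instance_id)
        elif status in ("ready", "pending") and not in_elb:
            # Failed deploy: new instances that never reached an ELB.
            reap.append(instance_id)
    return reap
```

Instances tagged "in-use", and "ready" or "pending" instances already registered in an ELB, are left alone, so a half-finished deployment cannot take serving capacity away.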