2013 10-25 dev-opsdays
1. Pragmatic steps on the
path to continuous delivery
Zero to Hero
@geoffnettaglich
2. Continuous Delivery : Theory
10 have idea
20 implement idea
30 measure effectiveness
40 refine idea
50 GOTO 10
60 PROFIT
build / deploy / feedback
automated
3. Continuous Delivery : Reality
● Everyone is doing it (apparently)
○ Amazon, Etsy, Facebook, Netflix
■ N deploys per day / min (where N > 0)
○ Great idea (in theory)
● What about me?
○ Existing codebase (smells and all)
○ Existing systems & environments (no access)
○ Existing team and politics (no idea)
inertia
4. Continuous Delivery : Practice
● Focus on the journey
○ it will be ongoing anyway
○ simplify / automate
Suck less tomorrow!
● Crawl / Walk / Run
○ Find a place to start
■ most painful
■ simplest entry point
5. Build : Theory
● Dependencies : what we need
lib of Jars / Maven / Ivy
● Artifacts : what we produce
tgz / jar / war
● Reproducible : scripted
Ant / Maven / Gradle
● Automatable
Jenkins
6. Build : Reality
● GOOD
○ Using ant + ivy + sh + junit
● NOT SO GOOD
○ Issues with build dependencies
○ No output artifact
○ No easy way to create DB
○ Only worked on developer's machine
○ Took a day or two to get new team members working
7. Build : Practice
● Tidy up existing ant build (based on intent)
○ clean / compile / test / package
○ create / bootstrap DB locally
● Bundle build properties
● Run app locally
● Generate reports
○ Unit tests (JUnit)
○ Static analysis (FindBugs et al)
checkout / build / run
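The checkout / build / run flow above could be sketched as a small runner that executes intent-named steps in order and stops at the first failure. This is a minimal sketch, not the talk's actual build: the step names mirror the targets on the slide, and the commands passed in are whatever the project uses (for example `ant clean`, which is an assumption here).

```python
import subprocess
from typing import List, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (intent name, command to run)


def run_steps(steps: List[Step]) -> List[str]:
    """Run each step in order; stop at the first non-zero exit code.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        print(f"==> {name}: {' '.join(command)}")
        result = subprocess.run(command)
        if result.returncode != 0:
            # Fail fast: a broken compile must never reach 'package'.
            print(f"step '{name}' failed with exit code {result.returncode}")
            break
        completed.append(name)
    return completed
```

A call such as `run_steps([("clean", ["ant", "clean"]), ("compile", ["ant", "compile"]), ("test", ["ant", "test"]), ("package", ["ant", "package"])])` would give a new team member one entry point instead of a day or two of setup.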
8. Deploy : Theory
● Push button / Zero downtime
○ Not always easy: Java != rails != php/apache
● Deploy from developer's machine is FAIL
○ Shared env or jump box
○ Then automate (via Jenkins)
● What version is running where
○ Build page
○ All servers up to date with latest version
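One way to answer "what version is running where" is to bundle a small build-info file in the artifact and render it on the build page. A minimal sketch under stated assumptions: the file name `build-info.json` and its field names are hypothetical, not taken from the talk.

```python
import json
import time
from pathlib import Path


def write_build_info(artifact_dir: Path, version: str, revision: str) -> Path:
    """Write build metadata into the artifact directory at package time."""
    info = {
        "version": version,      # e.g. the CI build number (assumed)
        "revision": revision,    # e.g. the VCS commit id (assumed)
        "built_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    path = artifact_dir / "build-info.json"
    path.write_text(json.dumps(info, indent=2))
    return path


def read_build_info(artifact_dir: Path) -> dict:
    """Read the metadata back at runtime, e.g. to render the build page."""
    return json.loads((artifact_dir / "build-info.json").read_text())


def is_up_to_date(running: dict, latest_version: str) -> bool:
    """True when a server reports the latest known version."""
    return running["version"] == latest_version
```

Polling each server's build page and feeding the result to `is_up_to_date` is enough to spot a node that missed a rollout.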
9. Deploy : Reality
● GOOD
○ Scripted (somewhat)
● NOT SO GOOD
○ Only from developer's machine [FAIL!]
○ Ignored errors (FULL STEAM AHEAD)
○ Rsync of build outcome
○ Manual stop / start
○ Customers may see errors during rollout
10. Deploy : Practice
● bash script (Capistrano or similar)
○ stop / upload / unpack / restart
○ filesystem layout similar to Capistrano's
● Verify
○ Build page with stats (bundled in artifact)
● DB changes
○ Backwards compatible changes (process)
○ Liquibase is great
● Automated (via Jenkins of course)
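The upload / unpack part of the steps above, with a Capistrano-style layout, might look like the sketch below. The talk used a bash script; this Python version only illustrates the idea. The `releases/` directory and `current` symlink follow Capistrano's convention, while the function name and `release_id` parameter are hypothetical. Stopping and restarting the service is left to the caller, and the artifact is assumed to be a trusted tgz.

```python
import os
import tarfile
from pathlib import Path


def deploy(artifact: Path, app_root: Path, release_id: str) -> Path:
    """Unpack a tgz into releases/<release_id> and repoint 'current' at it."""
    release_dir = app_root / "releases" / release_id
    release_dir.mkdir(parents=True, exist_ok=False)  # refuse to overwrite a release

    # Unpack: extract the build artifact into its own directory.
    with tarfile.open(artifact, "r:gz") as archive:
        archive.extractall(release_dir)

    # Switch: create a temporary symlink, then rename it over 'current'.
    # The rename is atomic on POSIX, so 'current' never points at nothing.
    current = app_root / "current"
    tmp_link = app_root / "current.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(release_dir.resolve(), tmp_link)
    os.replace(tmp_link, current)
    return release_dir
```

Because older releases stay on disk, rolling back is a matter of pointing `current` at the previous directory, unlike an rsync that overwrites the running code.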
11. Run : Theory
● Elastic scaling
○ Dynamic provisioning
○ Automated service management
● Automated monitoring and alerting
○ Sensu / Nagios / Ganglia
○ Logstash / Graylog2
● Dashboards for EVERYTHING!
12. Run : Reality
● ssh to servers
○ to start / stop services
○ monitor perf via top and ps
○ grep and tail -f of log files
● Only ‘qualified’ admins allowed
● No business metrics (manual reports)
● No visibility
● RESTART
WARNING!
Prone to human ERORS
13. Run : Practice
● New Relic
○ servers for free
○ app with paid
○ great insight
● Centralized logging
○ rsyslog: unification / collection
○ Papertrail: aggregation & display
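On the application side, centralized logging can be as simple as sending records to a syslog collector instead of a local file. A minimal sketch, assuming an rsyslog instance that accepts UDP syslog at the given host and port; the function name and the message format are hypothetical, and forwarding on to Papertrail would be configured in rsyslog, not here.

```python
import logging
import logging.handlers


def build_syslog_logger(name: str, host: str, port: int) -> logging.Logger:
    """Return a logger that forwards records to a syslog collector over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.handlers.SysLogHandler(address=(host, port))
        # Prefix each message with the app name so lines can be filtered centrally.
        handler.setFormatter(logging.Formatter(f"{name}: %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

With every server logging this way, `grep` and `tail -f` over ssh give way to one searchable stream.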
14. Run : Practice
● Health Metrics (in app)
○ home rolled exposed via web
○ Metrics by CodaHale / Pingdom
● Biz Metrics
○ Ducksboard
■ simple funnel
■ attrition and activity
● Monit
○ alert and restart (if needed)
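A home-rolled health endpoint exposed via the web can be very small. The sketch below uses only the Python standard library; the `/health` path and the fields in the response are illustrative assumptions, not the talk's actual endpoint.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def health_snapshot() -> dict:
    """Collect a few in-app health metrics (fields are illustrative)."""
    return {"status": "ok", "uptime_seconds": round(time.time() - START_TIME, 1)}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps(health_snapshot()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real app would log requests


def make_server(port: int = 0) -> HTTPServer:
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)
```

An uptime checker such as Pingdom, or a supervisor such as Monit, can then poll this URL and alert or restart when it stops answering with 200.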
15. Environments : Theory
● Scale – vertically or horizontally
● automatable, reproducible
○ Green / Blue deploys FTW
● Chef / Puppet / Ansible ... just pick one
● If a server fails, can you rebuild it?
● If an environment fails, can you rebuild it?
‘Works on my machine’
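The green / blue idea is easiest to see as a tiny state machine: deploy to the idle side, check it, and only then switch traffic. This is a conceptual sketch; the class and method names are hypothetical, and the two marked lines stand in for a real deploy and a real load balancer or DNS switch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BlueGreen:
    """Track two identical environments; only one receives live traffic."""
    live: str = "blue"
    versions: Dict[str, str] = field(default_factory=lambda: {"blue": "", "green": ""})

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def release(self, version: str, healthy: Callable[[str], bool]) -> bool:
        """Deploy to the idle side and switch traffic only if it is healthy."""
        target = self.idle
        self.versions[target] = version      # stand-in for the real deploy
        if not healthy(target):
            return False                     # live side is untouched
        self.live = target                   # stand-in for the load balancer switch
        return True
```

The scheme only works when both environments can be rebuilt from scripts, which is why it sits next to "automatable, reproducible".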
16. Environments : Reality
● Private cloud hosting
● Limited upgrade ability
● Manual updates
● Uncertain/misunderstood foundation
● SSH and rsync to update and deploy
○ overwrote existing codebase
You won’t always know what you inherit!
17. Environments : Practice
● Picked Chef (Vi vs Emacs …)
○ Roles for baseline / Web / App / DB nodes
○ Environments for DEV / STG / PRD
● Vagrant for testing recipes locally
○ config checked into app project
● Automate (via Jenkins of course)
○ validate cookbooks via FoodCritic
○ upload to hosted chef
○ rollout / reprovision via bash / ssh
18. Environments : Practice
● DB backup / restore
○ do it and test it regularly
○ scripted
○ hire an expert
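"Do it and test it regularly" means a backup is not done until a restore has been checked. The sketch below uses SQLite purely as a stand-in, since the talk does not name the database; the function names and the row-count check are illustrative assumptions, and a production check would compare more than one table.

```python
import sqlite3
from pathlib import Path


def backup_database(source: Path, target: Path) -> None:
    """Copy a live SQLite database to a backup file."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def verify_backup(source: Path, backup: Path, table: str) -> bool:
    """Open the backup and compare a table's row count with the source."""
    query = f"SELECT COUNT(*) FROM {table}"  # table name must be trusted
    counts = []
    for path in (source, backup):
        conn = sqlite3.connect(path)
        try:
            counts.append(conn.execute(query).fetchone()[0])
        finally:
            conn.close()
    return counts[0] == counts[1]
```

Running both functions from a scheduled Jenkins job would turn the restore test into a routine rather than an emergency.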
Delegate to others
● Mandrill for outgoing SMTP
● DNSimple for managed DNS
● Pingdom for simple uptime
○ custom health check endpoint
19. Environment : Current State
● Rackspace Cloud
● Hosted Chef
● Managed by Jenkins
○ Re-create dev / staging / production
○ deploy staging / production
● Monitored via
○ Pingdom
○ New Relic
○ Papertrail app
● Rolled out to iOS app dev TOO!
○ Xcode CLI / OCUnit / TestFlight via Jenkins
20. OH: I deployed to production,
nobody noticed and nothing
went wrong
FTW!