Continuous Deployment: The Dirty Details
Upcoming SlideShare
Loading in...5
×
 

Continuous Deployment: The Dirty Details

on

  • 16,388 views

Presented at ALM Summit 3 in Redmond, WA. January 2013. ...

Presented at ALM Summit 3 in Redmond, WA. January 2013.

Like what you've read? We're frequently hiring for a variety of engineering roles at Etsy. If you're interested, drop me a line or send me your resume: mike@etsy.com.

http://www.etsy.com/careers

Statistics

Views

Total Views
16,388
Views on SlideShare
15,963
Embed Views
425

Actions

Likes
46
Downloads
217
Comments
2

16 Embeds 425

http://beer32.tumblr.com 187
https://twitter.com 128
http://lanyrd.com 28
http://192.168.0.194 20
http://confluence.nehta.net.au 20
http://automatizationmakeitwork.blogspot.ru 13
http://automatizationmakeitwork.blogspot.com 9
http://mg.tl 8
http://draft.blogger.com 3
http://automatizationmakeitwork.blogspot.kr 2
http://automatizationmakeitwork.blogspot.de 2
http://elcurator.octo.com 1
http://www.bing.com 1
http://www.google.com 1
http://tweetedtimes.com 1
http://www.elcurator.net 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel

12 of 2

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • I have seen very few presentations on continuous deployment address data schema changes. Thank you.
    Are you sure you want to
    Your message goes here
    Processing…
  • great
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Continuous Deployment: The Dirty Details Continuous Deployment: The Dirty Details Presentation Transcript

    • CONTINUOUS DEPLOYMENT The Dirty DetailsMike BrittainENGINEERING DIRECTOR @mikebrittain mike@etsy.com
    • CONTINUOUS DEPLOYMENT The Dirty Details“OK, sounds cool. But I have some questions...”
    • CD 100- & 200-levels- CI environment for automated tests- Committing to trunk- Branching in code- Config flags (a.k.a. feature flags)- DevOps mentality- Metrics and alerting- Automated deploy scripts credit: photobookgirl (flickr)
    • CD 100- & 200-levels- CI environment for automated tests- Committing to trunk- Branching in code- Config flags (a.k.a. feature flags)- DevOps mentality CD 300 level- Metrics and alerting - Deploys vs. releases- Automated deploy scripts - Decoupled systems, schema changes - How we work: Arch. & Process - Integration and Operations credit: photobookgirl (flickr)
    • www. .com
    • GROSS MERCHANDISE SALEShttp://www.etsy.com/blog/news/2013/notes-from-chad-2012-year-in-review/
    • DECEMBER 2012 1.5 Billion page views $117 Million of goods sold 6 Million items soldItems by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/
    • 175+ Committers, everyone deployscredit: martin_heigan (flickr)
    • DEPLOYMENTS PER DAY Very end of 2009 Today
    • Continuous delivery is a pattern language in growing use in software development to improve the process of software delivery. Techniques such as automated testing, continuous integration, and continuous deployment allow software to be developed to a high standard and easily packaged and deployed to test environments, resulting in the ability to rapidly, reliably and repeatedly push out enhancements and bug fixes to customers at low risk and with minimal manual overhead. ~wikipediacredit: Stewart, redgen (flickr)
    • Architecture Stack Linux, Apache, MySQL, PHP Memcache, Gearman, Postgresql, Solr, Java, Traffic Server, Hadoop, HBase Git, Jenkinscredit: Saire Elizabeth (flickr)
    • Then Now 2009 2010-today Just before westarted using CD
    • Then Now 6-14 hours 15 mins“Deployment Army” 1 person Special event and Part of everydayhighly orchestrated workflow
    • Then NowBlocked for Blocked for6-14 hours. 15 minutes.6+ hours to 15 minutes to redeploy redeploy
    • Then Now Release branch, Mainline, database schemas, minimal linking data transforms, and building, packaging, rsync, rolling restarts, site up cache purging,scheduled downtime
    • 1st dayPut your face on Etsy.com.
    • 2nd day Complete tax, insurance, and benefits forms.credit: ktpupp (flickr)
    • WARNING
    • GROSS MERCHANDISE SALES
    • Continuous Deployment Small, frequent changes.Constantly integrating into production. 30+ deploys per day.
    • “Wow... 30 deploys a day.How do you build features so quickly?”
    • Software Deploy ≠ Product Launch
    • Deploys frequently gated by config flags (“dark” releases)
    • $cfg[‘new_search’] = array(enabled => off);$cfg[‘sign_in’] = array(enabled => on);$cfg[‘checkout’] = array(enabled => on);$cfg[‘homepage’] = array(enabled => on);
    • $cfg[‘new_search’] = array(enabled => off);
    • $cfg[‘new_search’] = array(enabled => off);// Meanwhile...# old and boring search$results = do_grep();
    • $cfg[‘new_search’] = array(enabled => off);// Meanwhile...if ($cfg[‘new_search’] == ‘on’) { # New and fancy search $results = do_solr();} else { # old and boring search $results = do_grep();}
    • $cfg[‘new_search’] = array(enabled => on);// or...$cfg[‘new_search’] = array(enabled => staff);// or...$cfg[‘new_search’] = array(enabled => 1%);// or...$cfg[‘new_search’] = array(enabled => users, user_list => mike,john,kellan);
    • Validate in production, hidden from public.
    • What’s in a deploy?Small incremental changes to the applicationNew classes, methods, controllersGraphics, stylesheets, templatesCopy/content changesTurning flags on, off, or % ramp up
    • Low MTTR (response times)Latent bugs and security holesTraffic management, load sheddingAdding and removing infrastructureTweaking config flags or releasing patches.
    • http://www.flickr.com/photos/flyforfun/2694158656/
    • Config flags Operator Metricshttp://www.flickr.com/photos/flyforfun/2694158656/
    • Favorites “on”
    • Favorites “off”
    • Many deploys eventually lead to a product launch.
    • “How do you continuously deploy database schema changes?”
    • Code deploys ~15-20 minutesSchema changes
    • Code deploys ~15-20 minutesSchema changes THURSDAYS!
    • Our web application is largely monolithic. Etsy.com, Support & Back-office tools, Developer API, Gearman (async work)
    • Our web application is largely monolithic. Etsy.com, Support & Back-office tools, Developer API, Gearman (async work) PHP, Apache, Memcache
    • External “services” are not deployed with the main application.e.g. Databases, Search, Photo storage, Payments
    • External “services” are not deployed with the main application. e.g. Databases, Search, Photo storage, Payments MYSQL PCI PROXY CACHE,(schema changes) (controlled access) FILERS, AMAZON S3 SOLR, JVM (specialized infra.) (rolling restarts)
    • For every config flag, there are two states we can support — present and future.
    • For every config flag, there are two states we can support — present and future. ... or past and present.
    • “Non-Breaking Expansions” Expose new version in a service interface;support multiple versions in the consumer.
    • Example: Changing a Database Schema Merging “users” and “users_prefs”
    • C RULE OF THUMB: Prefer ADDs over ALTERs (non-breaking expansion)
    • 1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version
    • 0. Add new version to schema1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version
    • 0. Add new version to schemaSchema change to add prefs columns to “users” table.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “off”“read_prefs_from_users_table” => “off”
    • 1. Write to both versionsWrite code for writing prefs to the “users” table.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “off”
    • 2. Backfill historical dataOffline process to sync existing data from “user_prefs”to new columns in “users”
    • 3. Read from new versionData validation tests. Ensure consistency both internallyand in production.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “staff”
    • 3. Read from new versionData validation tests. Ensure consistency both internallyand in production.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “1%”
    • 3. Read from new versionData validation tests. Ensure consistency both internallyand in production.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “5%”
    • 3. Read from new versionData validation tests. Ensure consistency both internallyand in production.“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “on”(“on” == “100%”)
    • 4. Cut-off writes to old versionAfter running on the new table for a significant amountof time, we can cut off writes to the old table.“write_prefs_to_user_prefs_table” => “off”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “on”
    • “Branch by Astraction” Controller Controller Users Model (Abstraction)“users” (old) “user_prefs” “users” old schema new schema http://paulhammant.com/blog/branch_by_abstraction.html http://continuousdelivery.com/2011/05/make-large-scale-changes-incrementally-with-branch-by-abstraction/
    • “The Migration 4-Step”1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version
    • “When do you clean up all of those config flags?
    • We might remove config flags for the old version when...It is no longer valid for the business.It is no longer stable, maintained, or trusted.It has poor performance characteristics.The code is a mess, or difficult to read.We can afford to spend time on it.
    • Promote “dev flags” to “feature flags”
    • // Feature flag$cfg[‘mobilized_pages’] = array(enabled => on);// Dev flags$cfg[‘mobile_templates_seller_tools’] = array(enabled => on);$cfg[‘mobile_templates_account_tools’] = array(enabled => on);$cfg[‘mobile_templates_member_profile’] = array(enabled => on);$cfg[‘mobile_templates_search’] = array(enabled => off);$cfg[‘mobile_templates_activity_feed’] = array(enabled => off);...if ($cfg[‘mobilized_pages’] == ‘on’ && $cfg[‘mobile_templates_search’] == ‘on’) { // ... // ...}
    • // Feature flags$cfg[‘search’] = array(enabled => on);$cfg[‘developer_api’] = array(enabled => on);$cfg[‘seller_tools’] = array(enabled => on);$cfg[‘the_entire_web_site’] = array(enabled => on);
    • // Feature flags$cfg[‘search’] = array(enabled => on);$cfg[‘developer_api’] = array(enabled => on);$cfg[‘seller_tools’] = array(enabled => on);$cfg[‘the_entire_web_site’] = array(enabled => on);$cfg[‘the_entire_web_site_no_really_i_mean_it’] = array(enabled => on);
    • Architecture and Process
    • Deploying is cheap.
    • Releasing is cheap.
    • Some philosophies on product development...
    • Gathering data should be cheap, too. staff, opt-in prototypes, 1%
    • Treat first iterations as experiments.
    • Get into code as quickly as possible.
    • “Where a new system concept or new technology is used, one has to build asystem to throw away, for even the best planning is not so omniscient as toget it right the first time. Hence plan to throw one away; you will, anyhow.” ~ Fred Brooks, The Mythical Man-Month
    • Architecture largely doesn’t matter.
    • Kill things that don’t work.
    • Your assumptions will be wrong once you’ve scaled 10x.
    • “We don’t optimize for being right. We optimize for quickly detecting when we’re wrong.” ~Kellan Elliott-McCrea, CTO
    • Become really good at changing your architecture.
    • Invest time in architecture by the 2nd or 3rd iteration.
    • Integration and Operations
    • WARNING REMEMBER THIS?
    • Continuous Deployment Small, frequent changes.Constantly integrating into production. 30 deploys per day.
    • Code review before commit
    • Automated tests before deploy
    • Why Integrate with Production?
    • Dev ≠ Prod
    • Verify frequently and in small batches.
    • Integrating with production is a test in itself. We do this frequently and in small batches.
    • More database servers in prod.Bigger database hardware in prod.More web servers.Various replication schemes.Different versions of server and OS software.Schema changes applied at different times.Physical hardware in prod.More data in prod.Legacy data (7 years of odd user states).More traffic in prod.Wait, I mean MUCH more traffic in prod.Fewer elves.Faster disks (SSDs) in prod.
    • Using a MySQL database in dev for an application that will be runningon Oracle in production: Priceless
    • Verify frequently and in small batches.
    • Dev ⇾ QA ⇾ Staging ⇾ Prod
    • Dev ⇾ QA ⇾ Staging ⇾ Prod
    • Dev ⇾ Pre-Prod ⇾ Prod
    • Test and integrate where you’ll see value.
    • Config flags (again)off, on, staff, opt-in prototypes, user list, 0-100%
    • Config flags (again)off, on, staff, opt-in prototypes, user list, 0-100% “canary pools”
    • Automated tests after deploy
    • Real-time metrics and dashboardsNetwork & Servers, Application, Business
    • SERVER METRICSApache requests/sec, Busy processes,CPU utilization, Script exec time (med. & 95th)APPLICATION METRICSLogins, Registrations, Checkouts,Listings created, Forum postsTime and event correlated.
    • Real humans reporting trouble!
    • “Theoretical” vs. “Practical” Engineering
    • Config flags Operator Metricshttp://www.flickr.com/photos/flyforfun/2694158656/
    • Managing risk during Holiday Shopping season Thanksgiving, “Black Friday,” “Cyber Monday” ➔ Christmas (~30 days) Code Freeze?
    • DEPLOYMENTS PER DAY “Code Slush”
    • Tighten your feedback cyclesIntegrate with production and validate early in cycle.Use tools that allow you to detect issues early.Optimize for quick response times.Applied to both feature development and operability.
    • Thank you ... and questions?These slides will be available later today at http://mikebrittain.com/talks Mike Brittain ENGINEERING DIRECTOR @mikebrittain mike@etsy.com