Continuous Deployment @rantav @outbrain
How much time does it take your code to meet the users? More than a year? 6-12 months? 1-6 months? 2-4 weeks? 1-14 days? 1-24 hours? < 12 minutes?
~ $ svn ci -m "Implement the super-sharp image scaler. #deploy:ImageServer #to:ny"
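The commit message above carries the deployment request in hashtag-style tags. A minimal sketch of how such tags could be pulled out of a message follows. It assumes the tag syntax is `#deploy:<service>` and `#to:<target>`; the names `parse_deploy_tags` and `DeployRequest` are hypothetical and not taken from Outbrain's tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed tag syntax, inferred from the commit message on the slide:
#   #deploy:<service>   which service to deploy
#   #to:<target>        where to deploy it (for example a data center)
TAG_PATTERN = re.compile(r"#(deploy|to):(\S+)")


class DeployRequest(NamedTuple):
    service: str
    target: Optional[str]


def parse_deploy_tags(message: str) -> Optional[DeployRequest]:
    """Return the deployment request found in a commit message, if any."""
    tags = dict(TAG_PATTERN.findall(message))
    if "deploy" not in tags:
        # A plain commit with no deploy tag triggers nothing.
        return None
    return DeployRequest(service=tags["deploy"], target=tags.get("to"))


if __name__ == "__main__":
    msg = "Implement the super-sharp image scaler. #deploy:ImageServer #to:ny"
    print(parse_deploy_tags(msg))
```

Keeping the request in the commit message means the same action that records the change also asks for its deployment, so there is no separate step to forget.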
Outbrain enables readers to discover the most interesting, relevant and timely links to stories (paid and organic).
 
Multi-billion page views per month. Note: Outbrain is typically installed on *every* article/blog post on each of these sites, immediately under the content.
Who's in?
WHY
WHY For the business. For fun.
WHY What was so bad before?
WHY What is a startup? A startup is always on a quest to find product-market fit => you have to iterate fast ... before running out of money.
WHY Feedback loop speed is important (see REPL in programming).
WHY What used to be the case before: Inefficient waits (wait for QA, wait for other features, etc.). Inefficient context switches. Features delayed because of other features. Big changes, big problems.
HOW
HOW Continuous Deployment Themes: Release is a marketing decision. Deployment is an engineering decision!
HOW Continuous Deployment Themes: Small changes reduce risk. Kent Beck: You can spill a bucket but you can't spill a hose.
HOW Continuous Deployment Themes: Deploy fast, release often. Fast turnarounds lead to happy customers ... and happy developers.
HOW Culture: Everyone has to care about everything! Build, tests, quality, production, monitoring, business.
HOW Culture: No broken windows! (Broken Windows Theory)
HOW Culture: “What's the worst that could happen?”
WORKFLOW
WORKFLOW Dev
The INGREDIENTS
INGREDIENTS Trunk Stable: Everyone runs the tests before committing. No branches, really. Trunk may get released at any moment. If you commit now, users will see it really soon => test your code really well. Use feature flags (but only if you must). Keep forward and backward compatibility.
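A feature flag lets unfinished or risky code live on trunk and be deployed while staying switched off for users. The sketch below is illustrative only: the in-memory `FLAGS` dictionary, the flag name `new_image_scaler` and both scaler functions are hypothetical, and a real system would read flags from configuration rather than from a module-level dictionary.

```python
from typing import Dict, Tuple

# Hypothetical in-memory flag store. The default is "off", so code
# committed to trunk can be deployed without being released to users.
FLAGS: Dict[str, bool] = {"new_image_scaler": False}


def is_enabled(flag: str) -> bool:
    """Unknown flags count as disabled, which keeps old behavior the default."""
    return FLAGS.get(flag, False)


def scale_legacy(width: int, height: int, factor: float) -> Tuple[int, int]:
    # Existing behavior: truncate toward zero.
    return int(width * factor), int(height * factor)


def scale_new(width: int, height: int, factor: float) -> Tuple[int, int]:
    # New behavior under test: round to the nearest integer.
    return round(width * factor), round(height * factor)


def scale(width: int, height: int, factor: float) -> Tuple[int, int]:
    """Single entry point; callers never need to know which path runs."""
    if is_enabled("new_image_scaler"):
        return scale_new(width, height, factor)
    return scale_legacy(width, height, factor)
```

Both paths keep the same signature and return type, so callers stay compatible whichever way the flag is set, and turning the flag off again is the rollback.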
INGREDIENTS Automated Tests: We have about 2000 test cases. They run in < 4 minutes. We use TeamCity. We regularly test production. Desired state: no manual QA whatsoever.
INGREDIENTS Infrastructure Automation: "Infrastructure as code". A kickstart script installs the OS and the Chef agent. All infrastructure is deployed by Chef. All apps are deployed by Glu. All scripts, Chef cookbooks and Glu configs are kept in SVN. Very easy to deploy a large number of machines.
INGREDIENTS Deployer
 
 
 
INGREDIENTS Servicization: At Outbrain we have ~25 services. Each service is deployed on > 1 server. If there's damage, it's contained. It's easy to make small changes. If there's an error, it's easy to find it. If a rollback is needed, it's easier. Everything is either proxied (HAProxy) or queued (AMQ). Challenge: it's sometimes hard to find the right balance between the number of services that need maintenance, complexity, performance and API conformance.
INGREDIENTS The Immune System: Code review. Testing. Monitoring: we use Nagios for the vitals; regularly check servers via instrumentation; Keynote. Test production (Selenium, APT). Instrumentation. Self-test. Performance monitoring. KPIs: PVs, 3 different CTRs, clicks, revenue, RPM, etc.
INGREDIENTS The Immune System: Any single line of defense will eventually be broken => multiple lines of defense.
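The idea of several independent lines of defense can be sketched as a set of checks that all run on every pass, so one blind or broken check does not hide a problem that another would catch. This is only an illustration: `run_checks`, `kpi_within` and the check names in the usage note are hypothetical and do not describe Outbrain's Nagios or Keynote setup.

```python
from typing import Callable, Dict, List

Check = Callable[[], bool]


def kpi_within(current: float, baseline: float, tolerance: float) -> bool:
    """True if a KPI (for example a CTR) stays within a relative tolerance
    of its baseline. The tolerance is a fraction: 0.2 means +/- 20%."""
    return abs(current - baseline) <= tolerance * baseline


def run_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of the ones that failed.

    A check that raises counts as failed, and the remaining checks still
    run, so one broken line of defense does not disable the others.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed
```

A caller might register a self-test, a KPI check and a latency probe side by side, for example `run_checks({"self_test": lambda: True, "ctr": lambda: kpi_within(0.5, 1.0, 0.2)})`, and alert on any name that comes back.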
INGREDIENTS Visibility: Everyone monitors the services. When there's a deployment, you see it on the graph.
INGREDIENTS Visibility: Yammer
INGREDIENTS Visibility: svn changelog, glu changelog, glu audit log
INGREDIENTS Learn Fast, Adapt Fast: 5 Whys
ARSENAL Outbrain's arsenal, Architecture: Java in most parts; Spring for the most part, and Struts2; SOA (duh...) with REST; message queues (ActiveMQ); service multiplicity with HAProxy; data stored in MySQL, Cassandra, HDFS, NFS; caching: Memcached
ARSENAL Outbrain's arsenal, Testing: JUnit; TeamCity; Selenium; staging environment (for high-risk deployments)
ARSENAL Outbrain's arsenal, Deployment: Kickstart; Chef; Glu; TeamCity; in-house Deployment Manager; RPM + YUM
ARSENAL Outbrain's arsenal, Monitoring: Nagios; "testing production" every 10 minutes; Keynote
ARSENAL Outbrain's arsenal, Communications: Yammer
Fun Numbers: 5-50 production changes a day! More than 2000 code tests run in less than 4 minutes. More than 600 production service tests run every 10 minutes. It takes ~30 minutes from code complete to ~100 machines deployed.
References: Why Continuous Deployment / Eric Ries; Continuous Deployment at Outbrain / Ran Tavory; Deployment Infrastructure for Continuous Deployment / Wealthfront; Continuous Deployment presentation / Eishay Smith; Quantum of Deployment / Etsy; Chrome Release Cycle / Anthony Laforge

Continuous Deployment - Tech Talk week
