From Code to the Monkeys: Continuous Delivery at Netflix

At Netflix, we continue to improve our continuous delivery process. We thrive in a hybrid environment where every developer is able to deploy code, and with that freedom comes the responsibility for ensuring that our customers are not negatively impacted. We have built open source tools that work toward a continuous delivery solution. In this presentation, from QCon SF 2013, you will learn about our tool chain so that you can determine which tools make sense in your environment.

  • Overview. Build, Bake, and Deploy. Testing. Monkeys: resilient to behaviors inherent in the cloud. Leave with an understanding of the tools that we’ve built and open sourced. How you might be able to modify, augment, or create.
  • Innovate quickly. Think outside of the box, deploy solutions. Keep the promise of availability. Encourage best practices: recommendations, not limitations.
  • Deploy to production. Balance innovation with risk. Self-service is scalable. Don’t fix build configs, deploy. Gareth Bowles, Agile 2013.
  • Teams have unique flows. Let developers write code. Justin Ryan. Jenkins Job DSL plugin. Java Posse Roundup. Foundation for our build configurations at Netflix.
  • Amazon Machine Images (AMIs). Aminate: source component is combined with another component to make something new.
  • Base AMI: common to all of our microservices. Deploy the same image to test, prod, all regions. Other cloud platforms. NetflixOSS logo.
  • Self-service. Groovy app. Red/Black. Go through example.
  • Don’t replace the cluster. Spin up a new one. Canary/ACA. Find a problem or continue. Cloud native. Use the cloud.
  • Scale up. Leave the old cluster. Run through peak? Developer knows best.
  • Groovy library that sits on top of SWF. Clay McCoy. Start with a G. Allow flow to shine. Activity: element of reuse for our deployments. Builds on lessons from manual red/black deployments.
  • Stop now? Complete picture with runtime resiliency. Automate all the things.
  • Danger. Chaos ensues. Instances disappear. Latency happens. Litter. Find problems/build resiliency. Introduce a few. More ideas, need staff to build them! Look at vulnerabilities.
  • Should we push to everywhere at once?
  • Multiple regions. Errors sometimes make it to production. Limit impact. Cost: innovation and speed. Drift. Increase cognitive load.
  • Scheduled deployments. The button push signifies the scheduling, not necessarily the actual push. Providing visibility of what is deployed where, tied back to a Jenkins build, reduces that cognitive load.
  • We can do better. Look across regions: drift. Don’t nag. Use meaningful thresholds. Ask monkeys to help us test our runtime. Balance regional consistency with regional isolation.
  • Pay for only those instances we need. Don’t bother developers with what automation can do better.
  • Full circle. Code check-in to monkeys. Balanced priorities.
  • Can you use any of these elements? Share cloud infrastructure. Solve our business problems!
  • From Code to the Monkeys: Continuous Delivery at Netflix
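
The red/black notes above (spin up a new cluster, run a canary, scale up, shift traffic, leave the old cluster in place) can be illustrated with a small simulation. This is a minimal sketch only, not Asgard or Glisten code: the `Cluster` class, `red_black_push`, `roll_back`, and the `canary_ok` callback are hypothetical names invented for illustration, and no AWS calls are made.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for an auto scaling group (ASG)."""
    name: str
    instances: int = 0
    taking_traffic: bool = False


def red_black_push(old, new_name, canary_ok):
    """Run a simplified red/black push and return (active, standby)."""
    new = Cluster(new_name, instances=1)   # start small: one canary instance
    if not canary_ok(new):                 # find a problem, or continue
        new.instances = 0                  # discard the canary; old cluster is untouched
        return old, new
    new.instances = old.instances          # scale up to match the old cluster
    new.taking_traffic = True              # turn on traffic to the new ASG
    old.taking_traffic = False             # turn off traffic to the old ASG
    return new, old                        # old keeps its instances for a quick rollback


def roll_back(active, standby):
    """Send traffic back to the cluster that was left running."""
    standby.taking_traffic = True
    active.taking_traffic = False
    return standby, active


old = Cluster("api-v001", instances=6, taking_traffic=True)
active, standby = red_black_push(old, "api-v002", canary_ok=lambda c: True)
print(active.name, "is live;", standby.name, "is kept for rollback")
```

Leaving the old cluster at full size is what makes rollback a traffic switch rather than a redeploy; when to finally delete it (for example, after running through peak) is left to the developer, as the notes suggest.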

    1. BUILD BAKE DEPLOY. Continuous Delivery at Netflix: From Code to the Monkeys. Dianne Marsh, QCon San Francisco, November 2013. http://www.linkedin.com/in/diannemarsh
    2. BUILD BAKE DEPLOY
    3. Netflix Goals. High Availability, But … Move fast. Tools encourage Best Practices, But … Freedom to do the right thing.
    4. Teams Deploy Their Own Code. Run What You Wrote: Rapid Innovation; Rapid Detection; Rapid Response. = Freedom + Responsibility. http://www.slideshare.net/garethbowles/self-servicebuilddeploymentagile2013
    5. BUILD: Jenkins Job DSL. Configuration as Code. Groovy Script. Scripts go in Version Control. http://www.slideshare.net/quidryan/configuration-as-code
    6. BUILD BAKE
    7. BAKE: Aminator. Create AMI from Base AMI. Image contains service and everything needed to run it. Unit of Deployment for Test and Prod. Abstracts Cloud Details. http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html
    8. BUILD BAKE DEPLOY
    9. DEPLOY: Asgard, AWS Deployment Tool. Deploys Netflix to the Cloud. Red/Black push. http://www.infoq.com/presentations/asgard
    10. CANARY ANALYSIS. Test, Int, Prod. Choose where to deploy; run canary analysis; scale up new instances; turn on traffic to new ASG; turn off traffic to old ASG. Wait … analyze … continue.
    11. Asgard Developer Portal
    12. GLISTEN: Extending Asgard’s Workflow. Automated Red/Black Push. Test, Int, Prod stacks. Run canary/analysis; scale up new instances; turn on traffic to new ASG; run more tests; turn off traffic to old ASG. Wait … analyze … continue. http://techblog.netflix.com/2013/09/glisten-groovy-way-to-use-amazons.html
    13. BUILD BAKE DEPLOY
    14. Simian Army: Chaos Monkey, Latency Monkey, Janitor Monkey, Conformity Monkey (and more!). Test resiliency at runtime. http://www.infoq.com/presentations/netflix-resiliency-failure-cloud
    15. One Button Deployment?
    16. Regional Isolation. Limit Impact of Human Error: stagger deployments; canary testing per region.
    17. Multi-Region Consistency. Build Tooling to: schedule deployments; prefer off peak; choose next available region automatically; provide high visibility per region.
    18. Send in the Conformity Monkey. Have deployments diverged? Balance regional consistency with regional isolation; provide meaningful thresholds; build best practices into tooling and reporting. http://techblog.netflix.com/2013/05/conformity-monkey-keeping-yourcloud.html
    19. Clean up with Janitor Monkey: disassociate unused EIPs; delete unassociated Amazon EBS volumes; delete older Amazon EBS snapshots; leverage Amazon S3 Object Expiration. https://github.com/Netflix/SimianArmy/wiki/Janitor-Home
    20. Key Elements for Netflix: value self-service; test everywhere; build awareness of multiple regions; avoid peak times; roll back quickly and easily; be cloud native.
    21. Put NetflixOSS to Work for You. Netflix Platform. AMINATOR. And 30+ more projects at http://netflix.github.io/
    22. Keep the Conversation Going. Continuous Delivery Open Space, Ballroom B/C (here!), 1:35-2:25, immediately following lunch.
    23. Thanks! We’re always hiring! Dianne Marsh (@dmarsh), dmarsh@netflix.com
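
Slides 16 to 18 describe staggering deployments across regions, preferring off-peak windows, and checking whether regions have drifted apart. The sketch below shows one way such tooling could reason about those rules. It is an assumption-laden illustration, not the Netflix tooling or the Conformity Monkey: the `PEAK_HOURS` table, its times, the build numbers, and the function names are all hypothetical.

```python
from datetime import time

# Hypothetical peak windows per region (UTC); real values would come from traffic data.
PEAK_HOURS = {
    "us-east-1": (time(23, 0), time(4, 0)),   # window wraps past midnight
    "us-west-2": (time(2, 0), time(7, 0)),
    "eu-west-1": (time(18, 0), time(23, 0)),
}


def in_window(now, start, end):
    """True if `now` falls in [start, end), allowing windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def next_region(pending, now):
    """Pick the first pending region that is currently off peak, or None."""
    for region in pending:
        start, end = PEAK_HOURS[region]
        if not in_window(now, start, end):
            return region
    return None


def drifted_regions(deployed, max_builds_behind):
    """Return regions whose build number trails the newest by more than the threshold."""
    newest = max(deployed.values())
    return sorted(r for r, build in deployed.items()
                  if newest - build > max_builds_behind)


print(next_region(["us-east-1", "us-west-2", "eu-west-1"], time(1, 0)))
print(drifted_regions({"us-east-1": 120, "us-west-2": 119, "eu-west-1": 112},
                      max_builds_behind=3))
```

The threshold parameter reflects the "meaningful thresholds" point: a region that is one build behind during a staggered rollout is expected, so only larger gaps are reported.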
