Your SlideShare is downloading. ×
Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Saturn 2014. Engineering Velocity: Continuous Delivery at Netflix

1,348
views

Published on

At Netflix, we realize that there’s a tension between the availability of our service and our speed of innovation. If we move slowly, we can be very available -- but that’s not a good business …

At Netflix, we realize that there’s a tension between the availability of our service and our speed of innovation. If we move slowly, we can be very available -- but that’s not a good business proposition. If we move super fast, we risk downtime -- and that might annoy our customers. But
what if we could increase our velocity without significantly impacting availability? How can we shift that curve so that we’re moving faster without dropping any of those coveted 9’s?
How can we engineer velocity by weaving together tooling and culture with software development to expose and elevate highly effective practices? This talk describes various
components of Netflix’s continuous delivery platform -- much of which is available in open source. I’ll show how these pieces fit together and allow us to build scaffolding so that we’re comfortable with software developers making the decision to push the button for prod deployment -- and helps them to recover if necessary. As a result, we can run fast, trusting our tooling and our culture. I’ll also describe how we test our resiliency through simulating failure, unleashing the monkeys (Simian Army) on our production environment. Because if you’re afraid of cute little monkeys,
imagine how afraid you’ll be of a production environment that offers those same risks but doesn’t give you an opportunity to test your response to those dangers.

Throughout this talk, I hope that you will challenge yourself to consider how your company can "shift the curve" through tooling and to achieve a high velocity environment without negatively impacting reliability.

Published in: Software, Technology

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,348
On Slideshare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
9
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Engineering Velocity: Continuous Delivery at Netflix Dianne Marsh SATURN 2014
  • 2. en-gi-neer-ing + ve-loc-i-ty ! applying science and technology to designing and building speed into a system
  • 3. Availability vs. Rate of ChangeAvailablity(in9’s) 0 1 2 3 4 5 6 Rate of Change 0 10 100 1000
  • 4. Shift the CurveAvailablity(in9’s) 0 1 2 3 4 5 6 Rate of Change 0 10 100 1000 10000
  • 5. http://www.slideshare.net/reed2001/culture-1798664
  • 6. Manager’s Role Context, not Control Loosely coupled, Tightly aligned And hire well!
  • 7. Get out of the Way Freedom to Innovate
  • 8. Support Experimentation ! How We Built a Predictive Autoscaling Engine http://techblog.netflix.com/2013/11/scryer-netflixs-predictive-auto-scaling.html
  • 9. Support Independent Paths of Exploration Don’t Prematurely Optimize!
  • 10. Blameless Culture
  • 11. Developers Deploy Their Code Run What You Wrote ! • Rapid Innovation • Rapid Detection • Rapid Response ! = Freedom + Responsibility
  • 12. Support with Tools
  • 13. Jenkins Job DSL Configuration as Code Groovy Script Scripts go in Version Control http://www.slideshare.net/quidryan/configuration-as-code
  • 14. Aminator Create AMI from Base AMI Image contains service and everything needed to run it Unit of Deployment for Test and Prod Abstracts Cloud Details http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html
  • 15. Asgard Deploys Netflix to the Cloud Red/Black push Developed to address delays in rollback http://www.infoq.com/presentations/asgard
  • 16. Red/Black Push ! • Scale up new instances • Run canary analysis • Turn on traffic to new ASG • Turn off traffic to old ASG • Wait … analyze … continue
  • 17. Workflow Continuous Delivery Engine Judges between Stages Represent Best Practices http://techblog.netflix.com/2013/09/glisten-groovy-way-to-use-amazons.html
  • 18. One Click Deployment?
  • 19. Regional Isolation Limit Impact of Human Error ! • Stagger Deployments? • Canary Testing per Region? ! Know your Service!
  • 20. Multi-Region Consistency Build Tooling to: ! • Schedule Deployments • Prefer Off-Peak • Choose Next Available Region • Provide Visibility by Region
  • 21. Simian Army • Chaos Monkey • Latency Monkey • Conformity Monkey • Janitor Monkey 
 (and more) http://www.infoq.com/presentations/netflix-resiliency-failure-cloud
  • 22. Chaos Monkey Kills Running Instances • Simulates failures inherent to running in the cloud • In Production
  • 23. Latency Monkey Introduces Latency between services
  • 24. Conformity Monkey Have Deployments Diverged? • Balance Regional Consistency with Regional Isolation • Build Best Practices into Tooling and Reporting
  • 25. Janitor Monkey Reduce Cognitive Load and Cost • Remove unused instances • Uniform way to clean up
  • 26. Shifting the Curve with Tooling • Value Self-Service • Test Everywhere • Awareness of Multiple Regions • Best Practices Represented in Tooling • Recover Quickly and Easily • Be Cloud Native
  • 27. Shifting the Curve with Culture • Context not Control • Freedom to Experiment • Blameless Culture
  • 28. ArsTechnica, November 2012 “As the number of applications and the scale of the campaign's AWS infrastructure use climbed, the DevOps team shifted to using Asgard—an open-source tool developed by Netflix to manage cloud deployments.”
  • 29. Thanks! Dianne Marsh (@dmarsh) dmarsh@netflix.com