Engineering Velocity: Shifting the Curve at Netflix
QCon New York keynote

Published in: Engineering, Technology


Transcript

  • 1. Engineering Velocity: Shifting the Curve at Netflix Dianne Marsh (@dmarsh) QConNY 2014
  • 2. en-gi-neer-ing + ve-loc-i-ty: applying science and technology to designing and building speed into a system
  • 3. Availability vs. Rate of Change [chart: availability (in 9’s, 0 to 6) against rate of change (0 to 1000)]
  • 4. Shift the Curve [chart: availability (in 9’s, 0 to 6) against rate of change (0 to 10000)]
  • 5. Free the People. Optimize the Tools.
  • 6. Culture: Freedom and Responsibility http://www.slideshare.net/reed2001/culture-1798664
  • 7. With Freedom comes Responsibility
  • 8. Managers’ Role • Context, not Control • Loosely coupled, Tightly aligned • Attract and retain great talent!
  • 9. Get out of the Way Freedom to Innovate
  • 10. Support Experimentation: How We Built a Predictive Autoscaling Engine http://techblog.netflix.com/2013/11/scryer-netflixs-predictive-auto-scaling.html
  • 11. Support Independent Paths of Exploration
  • 12. Build a Blameless Culture
  • 13. Developers deploy their own code • Rapid Innovation • Detection • Response
  • 14. Optimize your Tools
  • 15. Netflix Build Language • Based on Gradle • Internal and Open Source • Gradle Summit talk: http://www.slideshare.net/quidryan/gradle-summit-2014-nebula https://github.com/nebula-plugins
  • 16. Jenkins Job DSL • Configuration as Code • Groovy Script • Scripts go in Version Control http://www.slideshare.net/quidryan/configuration-as-code
  • 17. Aminator • Create AMI from Base AMI • Image contains service and everything needed to run it • Builds Unit of Deployment for Test and Prod • Abstracts Cloud Details http://techblog.netflix.com/2013/03/ami-creation-with-aminator.html
  • 18. Asgard Deploys Netflix to the Cloud Red/Black push Developed to address delays in rollback http://www.infoq.com/presentations/asgard
  • 19. Red/Black Push • Scale up new instances while running the old version • Cloud Native • Turn on traffic to new ASG • Canary Analysis • Turn off traffic to old ASG • Wait … Analyze … Roll Back?
  • 20. Canary Analysis • Production Deployment Pattern • Compare Metrics vs. Baseline Version • “Canary Analyze All The Things: How we learned to Keep Calm and Release Often”, Roy Rapoport www.slideshare.net/royrapoport/20140612-q-con-canary-analysis
  • 21. Continuous Delivery Workflow • Support the Journey • Judges between Stages • Represent Best Practices http://techblog.netflix.com/2013/09/glisten-groovy-way-to-use-amazons.html
  • 22. One Click Deployment?
  • 23. Regional Isolation • Limit Impact of Human Error • Stagger Deployments? • Canary Testing per Region? • Know your Service!
  • 24. Multi-Region Consistency Build Tooling to: • Schedule Deployments • Prefer Off-Peak • Choose Next Available Region • Provide Visibility by Region
  • 25. http://www.infoq.com/presentations/netflix-resiliency-failure-cloud
  • 26. Chaos Monkey • Kills Running Instances • Simulates failures inherent to running in the cloud • In Production
  • 27. Latency Monkey • Introduces Latency between services
  • 28. Conformity Monkey • Have Deployments Diverged? • Balance Regional Consistency with Regional Isolation • Build Best Practices into Tooling and Reporting
  • 29. Janitor Monkey • Reduce Cognitive Load and Cost • Remove unused instances • Uniform way to clean up
  • 30. Shifting the Curve with Tools at Netflix • Value Self-Service • Test Everywhere • Awareness of Multiple Regions • Best Practices Represented in Tooling • Recover Quickly and Easily • Be Cloud Native • Respect the Journey
  • 31. Shifting the Curve with Culture at Netflix • Free the People! • Context not Control • Freedom to Experiment • Blameless Culture
  • 32. ArsTechnica, November 2012 “As the number of applications and the scale of the campaign's AWS infrastructure use climbed, the DevOps team shifted to using Asgard—an open-source tool developed by Netflix to manage cloud deployments.”
  • 33. Thanks! Dianne Marsh (@dmarsh) dmarsh@netflix.com
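
Slides 19 and 20 describe the red/black push and canary analysis only as bullet points. The following is a minimal sketch of that sequence, not Asgard's actual implementation: the `ServerGroup` class, the `error_rate` metric, and the `tolerance` threshold are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerGroup:
    """Hypothetical stand-in for an auto scaling group (ASG)."""
    name: str
    size: int = 0
    traffic_enabled: bool = False
    error_rate: float = 0.0  # fraction of failed requests, supplied by the caller


def canary_passes(baseline: ServerGroup, canary: ServerGroup,
                  tolerance: float = 0.01) -> bool:
    """Compare the new version's metric against the baseline version."""
    return canary.error_rate <= baseline.error_rate + tolerance


def red_black_push(old: ServerGroup, new: ServerGroup) -> list[str]:
    """Run one push and return the steps taken, in order."""
    steps = []
    # Scale up new instances while the old version keeps running.
    new.size = old.size
    steps.append(f"scale up {new.name} to {new.size}")
    # Turn on traffic to the new group.
    new.traffic_enabled = True
    steps.append(f"enable traffic on {new.name}")
    # Canary analysis decides which group keeps serving.
    if canary_passes(old, new):
        # The old group keeps its instances, so a later rollback
        # only needs to switch traffic back.
        old.traffic_enabled = False
        steps.append(f"disable traffic on {old.name}")
    else:
        new.traffic_enabled = False
        steps.append(f"roll back: disable traffic on {new.name}")
    return steps
```

Keeping the old group at full size after the switch is the design choice that makes rollback fast in this sketch: no instances have to be relaunched.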
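
Slide 24 lists what multi-region deployment tooling should do: schedule deployments, prefer off-peak, and choose the next available region. A small sketch of that selection logic follows. The `Region` class and the peak-hour windows are assumptions made for illustration; the talk does not say how Netflix's tooling models them.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Region:
    name: str
    peak_start_utc: int  # hour of day, inclusive (hypothetical value)
    peak_end_utc: int    # hour of day, exclusive (hypothetical value)


def is_off_peak(region: Region, hour_utc: int) -> bool:
    """Return True when hour_utc falls outside the region's peak window."""
    start, end = region.peak_start_utc, region.peak_end_utc
    if start <= end:
        in_peak = start <= hour_utc < end
    else:
        # The peak window wraps past midnight, e.g. 22:00 to 04:00.
        in_peak = hour_utc >= start or hour_utc < end
    return not in_peak


def next_region(regions: Iterable[Region], deployed: Set[str],
                hour_utc: int) -> Optional[Region]:
    """Pick the first region that is off-peak and not yet deployed."""
    for region in regions:
        if region.name not in deployed and is_off_peak(region, hour_utc):
            return region
    return None  # nothing is available right now; try again later
```

Calling `next_region` once per scheduling tick, and adding each finished region to `deployed`, staggers the rollout so that only one region changes at a time.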
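
Slide 26 summarizes Chaos Monkey as killing running instances in production to simulate cloud failures. The sketch below illustrates the idea only and is not the real Chaos Monkey code: the `unleash` function, the `terminate` callback, and the `probability` parameter are hypothetical, and a real tool would call a cloud API where this sketch calls the callback.

```python
import random
from typing import Callable, Dict, List


def unleash(groups: Dict[str, List[str]],
            terminate: Callable[[str], None],
            rng: random.Random,
            probability: float = 1.0) -> List[str]:
    """Terminate at most one randomly chosen instance per group.

    groups maps a group name to its running instance IDs.
    Returns the IDs that were passed to terminate, in order.
    """
    killed = []
    for group_name in sorted(groups):
        instances = groups[group_name]
        # Skip empty groups, and skip a group when the dice say so.
        if instances and rng.random() < probability:
            victim = rng.choice(instances)
            terminate(victim)
            killed.append(victim)
    return killed
```

Passing the random number generator in as an argument keeps a run reproducible, which helps when a termination needs to be traced afterwards.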