Why Everyone Needs DevOps Now: My Fourteen Year Journey Studying High Performing IT Organizations - Gene Kim, Author of The Phoenix Project

How do great IT organizations simultaneously deliver stellar service levels and a fast flow of new features into production? It requires creating a “super-tribe”, where development, test, IT operations and information security genuinely work together to solve business objectives, as opposed to throwing each other under the bus. In this talk, Gene Kim describes what successful development organization transformations look like, and how they were achieved from a Dev and Ops perspective. Drawing upon a 14-year study of high-performing IT organizations, Gene shares the best known methods, recipes and case studies of how to implement successful DevOps-style transformations. See Gene Kim's Edge Presentation: http://www.akamai.com/html/custconf/edgetv-developers.html#gene-kim

The Akamai Edge Conference is a gathering of the industry revolutionaries who are committed to creating leading edge experiences, realizing the full potential of what is possible in a Faster Forward World. From customer innovation stories, industry panels, technical labs, partner and government forums to Web security and developers' tracks, there’s something for everyone at Edge 2013.

Learn more at http://www.akamai.com/edge

Transcript

  • 1. Why Everyone Needs DevOps Now: My Fourteen Year Journey Studying High Performing IT Organizations. Gene Kim, @RealGeneKim, genek@realgenekim.me
  • 2. Where Did The High Performers Come From? @RealGeneKim
  • 3. Visible Ops: Playbook of High Performers
    The IT Process Institute has been studying high-performing organizations since 1999.
    • What is common to all the high performers?
    • What is different between them and average and low performers?
    • How did they become great?
    www.ITPI.org
  • 4. Act 1: IT Ops Fixing Fragile Artifacts
  • 5. @RealGeneKim
  • 6. The Product Managers @RealGeneKim
  • 7. Act 2: The Developers @RealGeneKim
  • 8. @RealGeneKim
  • 9. @RealGeneKim
  • 10. IT Ops And Dev At War
  • 11. Nothing Left For Infosec @RealGeneKim
  • 12. @RealGeneKim
  • 13. The Downward Spiral…
  • 14. @RealGeneKim
  • 15. So, CEOs Don’t Trust IT…
    • “If IT fails I don't know why… and if IT succeeds I don't know why.”
    • “By managing inputs and outputs, I can hold any area of the business accountable – except for IT…”
    • “Large investments in IT projects that eventually fail, without warning. And the CIO is the first to say, ‘I told you so.’”
    • “I can’t hold IT accountable – IT is way too ‘slippery.’”
    Source: Gene Kim 2012
  • 16. The IT Core Chronic Conflict
    Every IT organization is pressured to simultaneously:
    • Respond more quickly to urgent business needs
    • Provide stable, secure and predictable IT service
    Source: The authors acknowledge Dr. Eliyahu Goldratt, creator of the Theory of Constraints and author of The Goal, who has written extensively on the theory and practice of identifying and resolving core, chronic conflicts.
  • 17. Every Company Is An IT Company…
    • 95% of all capital projects have an IT component…
    • 50% of all capital spending is technology-related
    Where we need to be… IT is always in the way (again…) We are here…
  • 18. The Urgency Of This Business Problem
    “Of the Fortune 500 companies in 1955, 87% are gone...
    “In 1958, the Fortune 500 tenure was 61 years; now it’s 18 years…”
    –Richard Foster, “Creative Destruction”
  • 19. How Team Obama’s tech efficiency left Romney IT in dust
    Obama campaign’s tech team beat Romney by using opposite strategy – “insourcing.”
    Even taken with the software and Web hosting expenses, the Obama campaign spent a seventh of what the Romney campaign spent on digital…. In the end, the deciding factor wasn’t what the Obama campaign spent money on, but what it did with all that money. Insourcing gave the campaign a strategic flexibility that the Romney campaign lacked….
    “This is the difference...between a well run professional machine and a gaggle of amateurs.... I would be shocked if such a chasm exists next cycle between the parties – these aren’t mistakes to be repeated if you want to do things like win elections.”
    http://arstechnica.com/information-technology/2012/11/how-team-obamas-tech-efficiency-left-romney-it-in-dust/
  • 20. Build. Measure. Learn. Technologies accelerate business practice changes
    • The massive scope of its polling effort helped guide the Obama campaign in ways that would be impossible with conventional polling… three-day rolling-average tracking in each state.
    • “We ran the election 66,000 times every night,” said a senior official, describing the computer simulations the campaign ran to figure out Obama’s odds of winning each swing state. “And every morning we got the spit-out – here are your chances of winning these states. And that is how we allocated resources.”
    • Surveys used live interviewers, very large sample sizes and very short questionnaires, which focused on vote preference and strength of support, with no more than a handful of additional substantive questions.
    • Hired campaign staff engineers from Facebook, Twitter, Google, Microsoft, and technology startups.
    http://www.theatlantic.com/technology/archive/2012/11/when-the-nerds-go-marching-in/265325/
    http://www.huffingtonpost.com/2012/11/21/obama-campaign-polls-2012_n_2171242.html
    http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/
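"Running the election" many times per night reads like Monte Carlo simulation. The slide does not describe the campaign's actual model, so the following is only a minimal sketch of the general idea. The state names, win probabilities, electoral votes and the `SAFE_VOTES` figure are all invented for illustration.

```python
import random

# Hypothetical inputs: per-state (win probability, electoral votes).
# These numbers are illustrative only, not campaign data.
STATES = {
    "State A": (0.55, 18),
    "State B": (0.48, 29),
    "State C": (0.62, 10),
}
SAFE_VOTES = 250    # assumed votes already considered secure
VOTES_TO_WIN = 270


def simulate_once(rng):
    """Simulate one election and return the total electoral votes won."""
    total = SAFE_VOTES
    for prob, votes in STATES.values():
        if rng.random() < prob:
            total += votes
    return total


def win_probability(runs=66_000, seed=0):
    """Estimate the chance of reaching VOTES_TO_WIN over many simulated runs."""
    rng = random.Random(seed)
    wins = sum(simulate_once(rng) >= VOTES_TO_WIN for _ in range(runs))
    return wins / runs


if __name__ == "__main__":
    print(f"Estimated win probability: {win_probability():.3f}")
```

Re-running the estimate after changing one state's probability gives a rough sense of where extra resources would move the overall odds most, which is the allocation use the quote describes.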
  • 21. Act 3: There Must Be A Better Way…
  • 22. Source: John Allspaw @RealGeneKim
  • 23. @RealGeneKim
  • 24. Source: John Allspaw @RealGeneKim
  • 25. @RealGeneKim
  • 26. Source: John Allspaw @RealGeneKim
  • 27. Source: John Allspaw @RealGeneKim
  • 28. Source: Theo Schlossnagle @RealGeneKim
  • 29. Source: Theo Schlossnagle @RealGeneKim
  • 30. Source: Theo Schlossnagle @RealGeneKim
  • 31. Source: John Jenkins, Amazon.com @RealGeneKim
  • 32. @RealGeneKim
  • 33. Who Is Doing DevOps?
    • Google, Amazon, Netflix, Etsy, Akamai, Twitter, Facebook, Pinterest …
    • BNY Mellon, Bank of America, World Bank, Paychex, Intuit…
    • The Gap, Nordstrom, REI, Macy’s, GameStop, Target …
    • Portland State University, Seton Hill University, Kansas State University…
    • Who else?
  • 34. High Performing DevOps Teams
    They’re more agile
    • 30x more frequent deployments
    • 8,000x faster lead time than their peers
    They’re more reliable
    • 2x the change success rate
    • 12x faster MTTR
    Source: Puppet Labs 2012 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
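Two of the measures on this slide, change success rate and MTTR (mean time to restore), can be computed from a simple deployment log. This is a rough sketch; the record layout and the sample data are assumptions, not anything taken from the survey.

```python
from datetime import timedelta

# Hypothetical deployment records: whether the change succeeded and,
# if it failed, how long it took to restore service.
DEPLOYMENTS = [
    {"ok": True, "restore": None},
    {"ok": False, "restore": timedelta(minutes=30)},
    {"ok": True, "restore": None},
    {"ok": False, "restore": timedelta(minutes=90)},
]


def change_success_rate(records):
    """Fraction of deployments that did not cause a failure."""
    return sum(r["ok"] for r in records) / len(records)


def mean_time_to_restore(records):
    """Average restore time across failed deployments (zero if none failed)."""
    restores = [r["restore"] for r in records if not r["ok"]]
    if not restores:
        return timedelta(0)
    return sum(restores, timedelta(0)) / len(restores)


if __name__ == "__main__":
    print(f"change success rate: {change_success_rate(DEPLOYMENTS):.0%}")
    print(f"MTTR: {mean_time_to_restore(DEPLOYMENTS)}")
```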
  • 35. 35 @RealGeneKim
  • 36. How Can We Better Sell DevOps?
  • 37. Eric Passmore, former SVP Global Engineering, AOL (2007) 37 @RealGeneKim
  • 38. The Downward Spiral
    Operations Sees…
    • Fragile applications are prone to failure
    • Long time required to figure out “which bit got flipped”
    • Detective control is a salesperson
    • Too much time required to restore service
    • Too much firefighting and unplanned work
    • Planned project work cannot complete
    • Frustrated customers leave
    • Market share goes down
    • Business misses Wall Street commitments
    • Business makes even larger promises to Wall Street
    Dev Sees…
    • More urgent, date-driven projects put into the queue
    • Even more fragile code put into production
    • More releases have increasingly “turbulent installs”
    • Release cycles lengthen to amortize “cost of deployments”
    • Failing bigger deployments more difficult to diagnose
    • Most senior and constrained IT ops resources have less time to fix underlying process problems
    • Ever increasing backlog of infrastructure projects that could fix root cause and reduce costs
    • Ever increasing amount of tension between IT Ops and Development
    These aren’t IT Operations problems… These are business problems!
  • 39. Gene Kim, CTO, Tripwire, Inc. (2006) 39 @RealGeneKim
  • 40. The Downward Spiral
  • 41. Anonymous Product Manager / UX (2011) 41 @RealGeneKim
  • 42. The Downward Spiral
  • 43. Anonymous Infosec Officer (2012) 43 @RealGeneKim
  • 44. 44 @RealGeneKim
  • 45. @RealGeneKim
  • 46. “This book will have a profound effect on IT, just as The Goal did for manufacturing.” –Jez Humble, co-author Continuous Delivery
    “This is the IT swamp draining manual for anyone who is neck deep in alligators.” –Adrian Cockcroft, Cloud Architect at Netflix
    “This is The Goal for our decade, and is for any IT professional who wants their life back.” –Charles Betz, IT architect, author “Architecture and Patterns for IT”
  • 47. The First Way: Flow @RealGeneKim
  • 48. The First Way: Flow
    • Understand the flow of work
    • Always seek to increase flow
    • Never unconsciously pass defects downstream
    • Never allow local optimization to cause global degradation
    • Achieve profound understanding of the system
  • 49. “Annual business planning sessions can be maddening. They think IT Operations is an ‘all you can eat buffet.’” -Ben Rockwood, Director Systems Engineering, Joyent
  • 50. Define The Work and Make It Visible
    • Business projects (e.g., new order system)
    • Internal IT projects (e.g., configuration management, automation, debt reduction)
    • Changes (e.g., deploys, improve database performance)
    • Unplanned work (e.g., site down, site impaired)
  • 51. Questions
    • What is your lead time for changes? (i.e., how long does it take to go from “code committed” to “code successfully running in production”)
    • How much of that is queue time vs. run time?
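One way to answer both questions is to record, for each stage a change passes through, when it entered the queue, when work started and when work finished. The sketch below assumes such a log exists. The stage names and timestamps are made up, and the stages are assumed to be contiguous, so lead time equals queue time plus run time.

```python
from datetime import datetime, timedelta

# Hypothetical stage log for one change:
# (stage, entered queue, work started, work finished)
STAGES = [
    ("build", datetime(2013, 10, 1, 9, 0), datetime(2013, 10, 1, 9, 5), datetime(2013, 10, 1, 9, 15)),
    ("test", datetime(2013, 10, 1, 9, 15), datetime(2013, 10, 1, 13, 15), datetime(2013, 10, 1, 13, 45)),
    ("deploy", datetime(2013, 10, 1, 13, 45), datetime(2013, 10, 2, 9, 45), datetime(2013, 10, 2, 10, 0)),
]


def lead_time_breakdown(stages):
    """Return (lead_time, queue_time, run_time) as timedeltas."""
    queue = sum((start - queued for _, queued, start, _ in stages), timedelta(0))
    run = sum((done - start for _, _, start, done in stages), timedelta(0))
    lead = stages[-1][3] - stages[0][1]  # first queue entry to last finish
    return lead, queue, run


if __name__ == "__main__":
    lead, queue, run = lead_time_breakdown(STAGES)
    print(f"lead time: {lead}, queue time: {queue}, run time: {run}")
    print(f"share of lead time spent waiting: {queue / lead:.0%}")
```

In this invented example almost all of the 25-hour lead time is waiting, which is the kind of result the second question is meant to surface.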
  • 52. @RealGeneKim
  • 53. @RealGeneKim
  • 54. Create One Step Environment Creation Process
    • Make environments available early in the Development process
    • Make sure Dev builds the code and environment at the same time
    • Create a common Dev, QA and Production environment creation process
  • 55. If I had a magic wand, I’d change the Agile sprints and definition of “done”: “At the end of each sprint, we must have working and shippable code, demonstrated in an environment that resembles production.” @RealGeneKim
  • 56. Deploy Smaller Changes, More Frequently
    • Decouple feature releases from code deployments
    • Deploy features in a disabled state, using feature flags
    • Require all developers check code into trunk daily (at least)
    • Practice deploying smaller changes, which dramatically reduces risk and improves MTTR
  • 57. Breaking The Bottlenecks In The Flow
    • Environment creation
    • Code deployment
    • Test setup and run
    • Overly tight architecture
    • Development
    • Product management
  • 58. How organizations achieve high performance
    • 89% are using infrastructure version control
    • 82% are using automated code deployments
    Source: Puppet Labs 2012 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
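Deploying a feature in a disabled state can be as simple as guarding the new code path with a flag lookup. The following is a minimal sketch; the `FLAGS` dictionary and the `new_checkout` flag name are hypothetical, and real systems usually read flags from a configuration service so they can be flipped without a deploy.

```python
# Hypothetical in-memory flag store; the new code ships switched off.
FLAGS = {"new_checkout": False}


def is_enabled(name, flags=FLAGS):
    """Unknown flags default to off, so half-finished features stay dark."""
    return flags.get(name, False)


def checkout(cart, flags=FLAGS):
    """Both code paths are deployed; the flag decides which one runs."""
    if is_enabled("new_checkout", flags):
        return f"new checkout flow: {len(cart)} item(s)"
    return f"legacy checkout flow: {len(cart)} item(s)"


if __name__ == "__main__":
    print(checkout(["book"]))                          # deployed, not released
    print(checkout(["book"], {"new_checkout": True}))  # released by flipping the flag
```

Releasing the feature then becomes a configuration change rather than a code deployment, which is the decoupling the slide asks for.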
  • 59. Why Dedicated Teams Vs. Shared Services 59 @RealGeneKim
  • 60. @RealGeneKim
  • 61. Leankit Kanban 61 @RealGeneKim
  • 62. Blackboard Learn: 2005-Present 62 Source: David Ashman, Chief Architect, Blackboard, Inc. @RealGeneKim
  • 63. Blackboard Learn Building Blocks 63 Source: David Ashman, Chief Architect, Blackboard, Inc. @RealGeneKim
  • 64. The First Way: Outcomes
    • Creating single repository for code and environments
    • Determinism in the release process
    • Consistent Dev, Test and Production environments, all properly built before deployment begins
    • A continuous delivery pipeline that can be relied upon and daily Dev code commits
    • Free ourselves from the learned behavior of catastrophic deployments
    • Decreased lead time
    • Reduce deployment times from 6 hours to 45 minutes
    • Refactor deployment process that had 1300+ steps spanning 4 weeks
    • Faster cycle time and release cadence
  • 65. The Second Way: Feedback @RealGeneKim
  • 66. The Second Way: Feedback
    • Understand and respond to the needs of all customers, internal and external
    • Shorten and amplify all feedback loops: stop the line when necessary
    • Create quality at the source
    • Create and embed knowledge where we need it
  • 67. Source: John Shook 67 @RealGeneKim
  • 68. “We found that when we woke up developers at 2am, defects got fixed faster than ever” – Patrick Lightbody, CEO, BrowserMob @RealGeneKim
  • 69. Require That Devs Manage Their Own Code For 6+ Months Source: Tom Limoncelli, Google 69 @RealGeneKim
  • 70. Test Whether Developers Qualify For IT Operations Resources
    • Types/frequency of pager alerts
    • Maturity of monitoring
    • System architecture review
    • Release process
    • Defect counts and severity
    • Production hygiene
    Source: Tom Limoncelli, Google
  • 71. Return Fragile Services Back To Dev Source: Tom Limoncelli, Google 71 @RealGeneKim
  • 72. Feedback And Situational Awareness “Having a developer add a monitoring metric shouldn’t feel like a schema change.” – John Allspaw, SVP Tech Ops, Etsy 72 @RealGeneKim
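A cheap-to-add metric usually means no registration step: the counter or timer comes into existence the first time code touches it. The sketch below is a toy in-process version of that idea. The `Metrics` class and the metric names are hypothetical, and it does not reflect any specific monitoring tool used at Etsy.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager


class Metrics:
    """Toy metrics registry: metrics are created on first use."""

    def __init__(self):
        self.counters = Counter()
        self.timings = defaultdict(list)

    def incr(self, name, amount=1):
        """Count an event; no schema or pre-declaration needed."""
        self.counters[name] += amount

    @contextmanager
    def timer(self, name):
        """Record how long the wrapped block took, in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings[name].append(time.perf_counter() - start)


if __name__ == "__main__":
    metrics = Metrics()
    metrics.incr("logins.success")      # one line to add a new metric
    with metrics.timer("checkout.duration"):
        sum(range(1000))                # stand-in for real work
    print(dict(metrics.counters), dict(metrics.timings))
```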
  • 73. 73 @RealGeneKim
  • 74. 74 @RealGeneKim
  • 75. Integrating Into Continuous Delivery
    • The days of reviewing RFCs in Word docs in change management meetings are over
    • Failures must result in automated tests in the continuous deployment pipeline (Release, Config, Change)
    • Invite or embed Ops into Dev standups and the scrum teams (“hey, we can sprint and scrums, too!”)
  • 76. Embed Dev Into IT Ops
    • Embed Dev into IT Ops incident escalation process
    • Put production monitoring in pre-production environments
    • Invite Dev to post-mortems/root cause analysis meeting
    • Have Dev and Infosec cross-train IT Operations
    • Ensure application monitoring/metrics to aid in Ops and Infosec work (e.g., incident/problem management)
  • 77. What’s In It For Infosec And QA? 77 @RealGeneKim
  • 78. The Second Way: Outcomes
    • Defects and security issues getting fixed faster than ever
    • Standardized and reusable Ops and Infosec user stories now part of the Agile process
    • All groups communicating and coordinating better
    • Everybody is getting more work done
  • 79. The Third Way: Continual Experimentation And Learning @RealGeneKim
  • 80. The Third Way: Continual Experimentation And Learning
    Foster a culture that rewards:
    • Experimentation (taking risks) and learning from failure
    • Repetition is the prerequisite to mastery
    Why?
    • You need a culture that keeps pushing into the danger zone
    • And have the habits that enable you to survive in the danger zone
  • 81. Break Things Early And Often “Do painful things more frequently, so you can make it less painful… We don’t get pushback from Dev, because they know it makes rollouts smoother.” – Adrian Cockcroft, Architect, Netflix @RealGeneKim
  • 82. 82 @RealGeneKim
  • 83. Inject Failures Often @RealGeneKim
  • 84. You Don’t Choose Chaos Monkey… Chaos Monkey Chooses You @RealGeneKim
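The core of failure injection is small: pick a running instance at random, kill it, and check whether the service still has enough healthy capacity. The sketch below shows that loop against an in-memory fleet. It is not Netflix's Chaos Monkey; the fleet dictionary, instance names and `min_healthy` threshold are invented for illustration.

```python
import random


def running_instances(fleet):
    """Names of instances currently marked as running, in a stable order."""
    return sorted(name for name, state in fleet.items() if state == "running")


def inject_failure(fleet, rng, min_healthy=1):
    """Terminate one random running instance; return (victim, still_healthy)."""
    candidates = running_instances(fleet)
    victim = rng.choice(candidates) if candidates else None
    if victim is not None:
        fleet[victim] = "terminated"
    still_healthy = len(running_instances(fleet)) >= min_healthy
    return victim, still_healthy


if __name__ == "__main__":
    fleet = {"web-1": "running", "web-2": "running", "web-3": "running"}
    victim, ok = inject_failure(fleet, random.Random())
    print(f"terminated {victim}; service still healthy: {ok}")
```

Running something like this on a schedule, rather than waiting for a real outage, is what turns recovery into a practiced routine.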
  • 85. Break Things Before Production
    • Enforce consistency in code, environments and configurations across the environments
    • Add your ASSERTs to find misconfigurations, enforce https, etc.
    • Add static code analysis to automated continuous integration and testing process
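Configuration assertions of this kind can run at startup or as a pipeline step, so a misconfigured environment fails before it reaches production. This is a minimal sketch; the configuration keys and rules are assumptions chosen to mirror the "enforce https" example. It raises `ValueError` rather than using Python's `assert` statement, because `assert` is skipped when Python runs with `-O`.

```python
def find_misconfigurations(config):
    """Return a list of problems found in a hypothetical service config."""
    problems = []
    if not str(config.get("base_url", "")).startswith("https://"):
        problems.append("base_url must use https")
    if config.get("debug", False):
        problems.append("debug must be off")
    if config.get("environment") not in {"dev", "qa", "production"}:
        problems.append("environment must be dev, qa or production")
    return problems


def enforce(config):
    """Fail fast, with every problem listed, if the config is invalid."""
    problems = find_misconfigurations(config)
    if problems:
        raise ValueError("; ".join(problems))


if __name__ == "__main__":
    enforce({"base_url": "https://example.com", "debug": False, "environment": "qa"})
    print("configuration checks passed")
```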
  • 86. Reduce Technical Debt
    “The deal with engineering goes like this. Product management takes 20% of the capacity right off the top and gives this to engineering to spend as they see fit. Whatever is required to avoid, ‘we need to stop features to rewrite code.’
    “If you’re in really bad shape today, you might need to make this 30% or even more of the resources. I get nervous when I find teams that think they can get away with much less than 20%.”
    – Marty Cagan, Inspired
  • 87. Allocate 20% Of Cycles To Technical Debt Reduction @RealGeneKim
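As a worked example of the 20% rule, the split for a single sprint can be computed directly. The sprint size of 50 points and the function name are assumptions for illustration only.

```python
def split_capacity(total_points, debt_share=0.20):
    """Reserve a share of sprint capacity for technical debt reduction."""
    if not 0 <= debt_share <= 1:
        raise ValueError("debt_share must be between 0 and 1")
    debt = round(total_points * debt_share)
    return {"technical_debt": debt, "features": total_points - debt}


if __name__ == "__main__":
    print(split_capacity(50))        # the default 20% off the top
    print(split_capacity(50, 0.30))  # a team in worse shape reserving 30%
```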
  • 88. Recognize Compounding Technical Debt… @RealGeneKim
  • 89. That Gets Worse… @RealGeneKim
  • 90. And Fixing It… Source: Pingdom @RealGeneKim
  • 91. An Innovation Culture “By installing a rampant innovation culture, they now do 165 experiments in the three months of tax season. Our business result? Conversion rate of the website is up 50 percent. Employee result? Everyone loves it, because now their ideas can make it to market.” –Scott Cook, Intuit Founder 91 @RealGeneKim
  • 92. Convergence And Evolution Of Ideas
    • Four Steps To The Epiphany, Steven Blank (2005)
    • Principles Of Product Development Flow: Second Generation Lean Product Development, Donald Reinertsen (2009)
    • Lean Startup, Eric Ries (2011)
    • Lean UX, Jeff Gothelf (2013)
  • 93. Performance by DevOps maturity Organizations that implemented DevOps practices over 12 months ago were 5x more likely to be high performing than organizations that weren’t implementing DevOps at all. 93 Source: Puppet Labs 2012 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
  • 94. Why Do I Think This Is Important?
  • 95. The Downward Spiral…
  • 96. @RealGeneKim
  • 97. 97 @RealGeneKim
  • 98. If I Could Wave A Magic Wand, Everyone Will…
    • See the suffering downstream, and have confidence that your intuitions and skills can make a profound and positive difference…
    • Become conversant with DevOps and recognize the practices when you see them
    • Be energized about how practitioners can contribute in this organizational journey
    • Leave with some concrete steps to get some great outcomes
    • Help create a team that starts putting DevOps practices into place
  • 99. If I Could Wave A Magic Wand, Everyone Will…
    • Become conversant with DevOps and recognize the practices when you see them
    • Be energized about how practitioners can contribute in this organizational journey
    • Leave with some concrete steps to get some great outcomes
    • Become a part of a team that starts putting DevOps practices into place
  • 100. “Some books you give to friends, for the joy of sharing a great novel.
    “Some books you recommend to your colleagues and employees, to create common ground.
    “Some books you share with your boss, to plant the seeds of a big idea.
    “The Phoenix Project is all three.”
    –Jeremiah Shirk, Integration & Infrastructure Manager at Kansas State University
  • 101. Our Mission: Positively Impact The Lives Of One Million IT Workers By 2017
    • Free 170 page excerpt: http://itrevolution.com/the-phoenixproject-excerpt/
    • http://slideshare.net/realgenekim
    • DevOps Defensive Audit Toolkit
    • Enterprise DevOps Case Studies
    • Early draft of upcoming “DevOps Cookbook” (Allspaw, Debois, Edwards, Humble, Kim, Orzen)
    Email me at genek@realgenekim.me @RealGeneKim