Why Everyone Needs DevOps Now: 15 Year Study Of High Performing Technology Orgs
 

This presentation describes my interpretation of the Why and How of DevOps, and the key findings from my 15 year study of high-performing IT organizations, and how they simultaneously deliver stellar service levels and rapid implementation of new features into the production environment.

Organizations employing DevOps practices, such as Google, Amazon, Facebook, Etsy and Twitter, are routinely deploying code into production hundreds, or even thousands, of times per day, while providing world-class availability, reliability and security. In contrast, most organizations struggle to do releases more than once every nine months.

I will present how these high-performing organizations achieve this fast flow of work through Product Management and Development, through QA and Infosec, and into IT Operations. By following their example, other organizations can replicate the extraordinary culture and outcomes that enable them to win in the marketplace.


Usage Rights

© All Rights Reserved

  • My name is Gene Kim. My area of passion started when I was the CTO and founder of Tripwire in 1999. I started keeping a list that we called “Gene’s list of people with great kung fu.” These were the organizations that simultaneously…
    In the next 25 minutes, I’m really excited to share with you some of my key learnings, which I hope will not only be applicable to you, but that you’ll be able to put into practice right away to get some amazing results.
    But let me tell you how my journey began…
  • [ picture of messy data center ] Ten minutes into Bill’s first day on the job, he has to deal with a payroll run failure. Tomorrow is payday, and finance just found out that while all the salaried employees are going to get paid, none of the hourly factory employees will. All their records from the factory timekeeping systems were zeroed out. Was it a SAN failure? A database failure? An application failure? Interface failure? Cabling error?
  • Source: http://biobreak.wordpress.com/2010/10/07/games-evangelism-dos-and-donts/
  • Who are they auditing? IT operations.
    I love IT operations. Why? Because when the developers screw up, the only people who can save the day are the IT operations people.
    Memory leak? No problem, we’ll do hourly reboots until you figure that out.
    Who here is from IT operations?
    Bad day:
    • Not as prepared for the audit as they thought
    • Spending 30% of their time scrambling, generating presentations for auditors
    • Or an outage, and the developer is adamant that they didn’t make the change. They’re saying, “It must be the security guys – they’re always causing outages”
    • Or there are 50 systems behind the load balancer, and six systems are acting funny. What’s different, and who made them different?
    • Or every server is like a snowflake, each having its own personality
    We as Tripwire practitioners can help them make sure changes are made visible, authorized, and deployed completely and accurately, and help them find differences.
    Create and enforce a culture of change management and causality.
  • EG Parts Unlimited, Inc. DBA Parts Unlimited is in serious trouble. Stock has tumbled 19% in the last 30 days, and is down 52% from its peak three years ago. The company continues to be outmaneuvered by its arch-rival, famous for its ability to anticipate and instantly react to customer needs. Parts Unlimited now trails the competition in sales growth, inventory turns and profitability.
    Parts Unlimited has been promising the release of software called “Phoenix” which, if they can ever get it released, should close the gap. It tightly integrates its retailing and e-commerce channels. Already years late, many expect the company to announce another program delay in their analyst earnings call next month. 20 million in, years late, and the Board and the Investors are, let’s just say, restless and looking for heads. Which means not only have some of the players been let go or moved positions, but the board is looking at outsourcing and/or splitting up the company.
    The board has given the team six months to make dramatic improvements.
  • Source: Flickr: birdsandanchors
  • Who’s introducing variance? Well, it’s often these guys. Show me a developer who isn’t causing an outage, I’ll show you one who is on vacation.
    Primary measurement is deploying features quickly: get to market.
    I’ve worked with two of the five largest Internet companies (Google, Microsoft, Yahoo, AOL, Amazon), and I now believe that the biggest differentiator to great time to market is great operations.
    Bad day:
    • We do 6 weeks of testing, but deployment still fails. Why? QA environment doesn’t match production
    • Or there’s a failure in testing, and no one can agree whether it’s a code failure or an environment failure
    • Or changes are made in QA, but no one wrote them down, so they didn’t get replicated downstream in production
    Believe it or not, we as Tripwire practitioners can even help them: make sure environments are available when we need them, that they’re configured correctly the first time, document all the changes, and replicate them downstream.
  • So who are all these constituencies that we can help, to increase our relevance as Tripwire practitioners and champions?
    How many people here are in infosec?
    Goal: protect critical systems and data
    • Safeguard organizational commitments
    • Prevent security breaches, help quickly detect and recover from them
    Bad day: no security standards
    • No one is complying
    • Yes, we’re 3 years behind. “Whaddya gonna do about it?”
    Vs. we (Tripwire owners) can become more relevant and add value by helping infosec leverage all the configuration guidance out there
    • Measure variance between production and those known good states
    • Trust and verify that when management says, “We’ve trued up the configurations,” they’ve actually done it
    Why? Now, more than ever, there is an ever-increasing number of regulatory and contractual requirements to protect systems and data.
  • There are many ways to react to this, such as fear, horror, trying to become invisible… All understandable, given the circumstances…
    Because infosec can no longer take 4 weeks to turn around a security review for application code, or take 6 weeks to turn around a firewall change.
    But, on the other hand, I think it will be the best thing to happen to infosec in the past 20 years. We’re calling this Rugged DevOps, because it’s a way for infosec to integrate into the DevOps process and be welcomed, and not be viewed as the shrill hysterical folks who slow the business down.
  • Tell story of Amazon, Netflix: they care about availability, security
    It’s not a push, it’s a pull: they’re looking for our help (#1 concern: fear of disintermediation and being marginalized)
  • Eran Feigenbaum, Director of Security, Google Enterprise

Why Everyone Needs DevOps Now: 15 Year Study Of High Performing Technology Orgs (Presentation Transcript)

  • Why Everyone Needs DevOps Now: My Fifteen Year Journey Studying High Performing IT Organizations. Gene Kim, @RealGeneKim
  • IT Operations
  • The Product Managers
  • The Developers
  • IT Ops And Dev At War
  • The Downward Spiral…
  • The IT Core Chronic Conflict
    Every IT organization is pressured to simultaneously:
    • Respond more quickly to urgent business needs
    • Provide stable, secure and predictable IT service
    Source: The authors acknowledge Dr. Eliyahu Goldratt, creator of the Theory of Constraints and author of The Goal, who has written extensively on the theory and practice of identifying and resolving core, chronic conflicts.
  • Every Company Is An IT Company…
    • 95% of all capital projects have an IT component…
    • 50% of all capital spending is technology-related
    We are here… Where we need to be… IT is always in the way (again…)
  • There Is A Better Way…
  • Google, Amazon, Netflix, Etsy, Spotify, Twitter, Facebook…
  • 10 deploys per day: Dev & ops cooperation at Flickr. John Allspaw & Paul Hammond, Velocity 2009. Source: John Allspaw
  • Little bit weird; Sits closer to the boss; Thinks too hard; Pulls levers & turns knobs; Easily excited; Yells a lot in emergencies. Source: John Allspaw
  • Source: John Allspaw
  • Ops who think like devs; Devs who think like ops. Source: John Allspaw
  • Dev and Ops. Source: John Allspaw
  • Source: John Jenkins, Amazon.com
  • Making Changes When It Matters Most
    “By installing a rampant innovation culture, we performed 165 experiments in the peak three months of tax season.” –Scott Cook, Intuit Founder
    “Our business result? Conversion rate of the website is up 50 percent. Employee result? Everyone loves it, because now their ideas can make it to market.”
  • Who Is Doing DevOps?
    • Google, Amazon, Netflix, Etsy, Spotify, Twitter, Facebook …
    • CSC, IBM, CA, SAP, HP, Microsoft, Red Hat …
    • GE Capital, Nationwide, BNP Paribas, BNY Mellon, World Bank, Paychex, Intuit …
    • The Gap, Nordstrom, Macy’s, Williams-Sonoma, Target …
    • General Motors, Northrop Grumman, LEGO, Bosch …
    • UK Government, US Department of Homeland Security …
    • Kansas State University…
    Who else?
  • High Performers Are More Agile
    • 30x more frequent deployments
    • 8,000x faster lead times than their peers
    Source: Puppet Labs 2013 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
  • High Performers Are More Reliable
    • 2x the change success rate
    • 12x faster mean time to recover (MTTR)
    Source: Puppet Labs 2013 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
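The two reliability measures named on this slide, change success rate and mean time to recover (MTTR), can be computed from a simple change log. The following is a minimal sketch and not the survey's methodology: the record fields (`succeeded`, `minutes_to_recover`) and the sample data are assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical change log. Field names and values are illustrative only.
changes = [
    {"id": "chg-1", "succeeded": True, "minutes_to_recover": 0},
    {"id": "chg-2", "succeeded": False, "minutes_to_recover": 90},
    {"id": "chg-3", "succeeded": True, "minutes_to_recover": 0},
    {"id": "chg-4", "succeeded": False, "minutes_to_recover": 30},
]


def change_success_rate(log):
    """Fraction of changes that did not cause a production failure."""
    return sum(1 for c in log if c["succeeded"]) / len(log)


def mean_time_to_recover(log):
    """Average minutes to restore service, over failed changes only."""
    failed = [c["minutes_to_recover"] for c in log if not c["succeeded"]]
    return mean(failed) if failed else 0.0


print("change success rate:", change_success_rate(changes))
print("MTTR (minutes):", mean_time_to_recover(changes))
```

Tracking both numbers together matters: a team can raise its success rate simply by deploying less often, but MTTR only improves when recovery itself gets faster.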
  • High Performers Win In The Marketplace
    • 2x more likely to exceed profitability, market share & productivity goals
    • 50% higher market capitalization growth over 3 years*
    Source: Puppet Labs 2014 State Of DevOps
  • Organizations with high performing DevOps organizations were 2.5x more likely to exceed profitability, market share and productivity goals, and had 50% higher market capitalization growth over 3 years. Source: Puppet Labs 2014 State Of DevOps
  • “This book will have a profound effect on IT, just as The Goal did for manufacturing.” –Jez Humble, co-author, Continuous Delivery
    “This is the IT swamp draining manual for anyone who is neck deep in alligators.” –Adrian Cockcroft, Cloud Architect at Netflix
    “This is The Goal for our decade, and is for any IT professional who wants their life back.” –Charles Betz, IT architect, author, “Architecture and Patterns for IT”
  • The First Way: Flow
  • “deploys per day” vs. “lead time”
  • “What is your lead time for changes?” “How long does it take to go from code committed to code successfully running in production?”
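Lead time as defined on this slide (code committed to code successfully running in production) can be measured directly from two timestamps per change. This is a minimal sketch under stated assumptions: the timestamp pairs are invented sample data, and in practice they would come from a version control system and a deployment log.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical (commit time, running-in-production time) pairs.
deployments = [
    (datetime(2014, 5, 1, 9, 0), datetime(2014, 5, 1, 9, 45)),
    (datetime(2014, 5, 1, 13, 0), datetime(2014, 5, 2, 13, 0)),
    (datetime(2014, 5, 2, 10, 0), datetime(2014, 5, 2, 12, 0)),
]


def lead_time(committed_at, running_in_prod_at):
    """Elapsed time from code committed to code running in production."""
    return running_in_prod_at - committed_at


lead_times = [lead_time(c, d) for c, d in deployments]

# The median is less distorted by one slow change than the mean would be.
median_lead_time = median(lead_times)
worst_lead_time = max(lead_times)

print("median lead time:", median_lead_time)
print("worst lead time:", worst_lead_time)
```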
  • IT’S A TRAP
  • Create One Step Environment Creation Process
    • Make environments available early in the Development process
    • Make sure Dev builds the code and environment at the same time
    • Create a common Dev, QA and Production environment creation process
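One way to picture a common Dev, QA and Production creation process is a single function that builds every environment from the same base specification and permits only a short list of differences. This is a sketch under assumptions, not a tool from the talk: `BASE_SPEC`, `ALLOWED_OVERRIDES`, `create_environment` and all values are hypothetical names chosen for illustration.

```python
from copy import deepcopy

# Hypothetical base specification shared by every environment.
BASE_SPEC = {
    "os_image": "base-image-v1",
    "packages": ["app-server", "database-client"],
    "instance_count": 1,
}

# In this sketch, only sizing may differ between environments.
ALLOWED_OVERRIDES = {"instance_count"}


def create_environment(name, **overrides):
    """Build an environment spec from the common base in one step."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"overrides not allowed: {sorted(unknown)}")
    spec = deepcopy(BASE_SPEC)
    spec.update(overrides)
    spec["name"] = name
    return spec


dev = create_environment("dev")
qa = create_environment("qa")
prod = create_environment("production", instance_count=50)

print(dev["packages"] == qa["packages"] == prod["packages"])
```

Because every environment passes through the same function, a QA environment that does not match production becomes an explicit, rejected override rather than an undocumented manual change.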
  • If I had a magic wand, I’d change the Agile sprints and definition of “done”: “At the end of each sprint, we must have working and shippable code… demonstrated in an environment that resembles production.”
  • How organizations achieve high performance
    • 89% are using infrastructure version control
    • 82% are using automated code deployments
    Source: Puppet Labs 2012 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
  • Deploy Smaller Changes, More Frequently * Source: http://www.facebook.com/note.php?note_id=14218138919
  • Deploy Smaller Changes, More Frequently *
    • Decouple feature releases from code deployments
    • Deploy features in a disabled state, using feature flags
    • Require all developers to check code into trunk daily (at least)
    • Practice deploying smaller changes, which dramatically reduces risk and improves MTTR
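Decoupling feature releases from code deployments can be sketched with a feature flag: the new code path ships to production switched off, and is released later by flipping the flag rather than by deploying again. This is a minimal in-memory illustration; the `FeatureFlags` class and the `holiday_discount` flag name are hypothetical, and a real system would typically load flags from configuration.

```python
class FeatureFlags:
    """In-memory flag store; flags are off unless explicitly enabled."""

    def __init__(self, enabled=()):
        self._enabled = set(enabled)

    def is_enabled(self, name):
        return name in self._enabled

    def enable(self, name):
        self._enabled.add(name)


def checkout_total_cents(prices_cents, flags):
    """Sum a cart; the discount path is deployed but dark by default."""
    total = sum(prices_cents)
    if flags.is_enabled("holiday_discount"):  # hypothetical flag name
        total = total * 90 // 100  # 10% off, in integer cents
    return total


flags = FeatureFlags()
before_release = checkout_total_cents([1000, 2000], flags)

flags.enable("holiday_discount")  # release the feature without a deploy
after_release = checkout_total_cents([1000, 2000], flags)

print(before_release, after_release)
```

The same switch works in reverse: if the feature misbehaves, disabling the flag is a faster recovery than rolling back a deployment.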
  • Experiment: Reducing Batch Size By 50%. And the customer got the feature in half the time! Source: Scott Prugh, Chief Architect, CSG, Inc.
  • Breaking The Bottlenecks In The Flow
    • Environment creation
    • Code deployment
    • Test setup and run (mention @rohansingh)
    • Overly tight architecture
    • Development
    • Product management
  • “In November 2011, running even the most minimal test for CloudFoundry required deploying to 45 virtual machines, which took a half hour. This was way too long, and also prevented developers from testing on their own workstations. By using containers, within months, we got it down to 18 virtual machines so that any developer can deploy the entire system to single VM in six minutes.” – Elisabeth Hendrickson, Director of Quality Engineering, Pivotal Labs
  • Blackboard Learn: 2005-Present (chart: LoC and commits). Source: David Ashman, Chief Architect, Blackboard, Inc.
  • Blackboard Learn Building Blocks. Source: David Ashman, Chief Architect, Blackboard, Inc.
  • Top Predictors Of IT Performance (2014)
    • Version control of all production artifacts
    • Continuous integration and deployment
    • Automated acceptance testing
    • Peer-review of production changes (vs. external change approval)
    • High trust culture
    • Proactive monitoring of the production environment
    • Win-win relationship between Dev and Ops
    Source: Puppet Labs 2014 State Of DevOps
  • The First Way: Outcomes
    • Creating single repository for code and environments
    • Determinism in the release process
    • Consistent Dev, Test and Production environments, all properly built before deployment begins
    • Features being deployed daily without catastrophic failures
    • Decreased lead time
    • Faster cycle time and release cadence
  • The Second Way: Feedback
  • How many times is the andon cord pulled in a typical day at a Toyota manufacturing plant? 3,500 times per day
  • Why would Toyota do something so disruptive as stopping production thousands of times per day? “It’s the only way we can build 2,000 vehicles per day – that’s one completed vehicle every 55 seconds.”
  • Google Dev And Ops (2013)
    "Automated tests transform fear into boredom." -- Eran Messeri, Google
    • 15,000 engineers, working on 4,000+ projects
    • All code is checked into one source tree (billions of files!)
    • 5,500 code commits/day
    • 75 million test cases are run daily
  • Developers Carry Pagers
    “We found that when we woke up developers at 2am, defects got fixed faster than ever” – Patrick Lightbody, CEO, BrowserMob
    “You build it, you run it.” – Werner Vogels, CTO, Amazon
  • Developers Carry Pagers
    “As a developer, there has never been a more satisfying point in my career than when I wrote the code, I pushed the button to deploy it, I watched the metrics to see if it actually worked in production, and fixed it if it broke.” – Tim Tischler, Director of Operations Engr, Nike, Inc.
  • Devs Initially Self-Manage Their Own Code. Source: Tom Limoncelli, Google
  • Return Fragile Services Back To Dev. Source: Tom Limoncelli, Google
  • Pervasive Production Telemetry
    “Having a developer add a monitoring metric shouldn’t feel like a schema change.” – John Allspaw, SVP Tech Ops, Etsy
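The point of the quote is that adding a metric should cost a developer one line of code, with no registration or approval step. The sketch below illustrates that idea with a tiny in-process registry; it is an assumption-laden toy, not Etsy's tooling, and the `Metrics` class and metric names are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Tiny in-process registry; metrics are created on first use."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.timings_ms = defaultdict(list)

    def incr(self, name, amount=1):
        self.counters[name] += amount

    @contextmanager
    def timer(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.timings_ms[name].append(elapsed_ms)


metrics = Metrics()


def handle_login(user, password_ok):
    # Adding a new metric is a single line; nothing is declared up front.
    with metrics.timer("login.duration"):
        if password_ok:
            metrics.incr("login.success")
        else:
            metrics.incr("login.failure")
        return password_ok


handle_login("alice", True)
handle_login("bob", False)
handle_login("carol", True)

print(dict(metrics.counters))
```

A production setup would ship these values to a metrics backend instead of holding them in memory, but the developer-facing cost of one line per metric is the property being illustrated.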
  • People actually look at the logs! (Mention Verizon PCI Data Breach Study)
  • One Of The Highest Predictors Of Performance
  • Can Large Orgs Adopt These Practices? Yes! (Automated testing, continuous integration, proactive monitoring, even high trust cultures!) The only practice not being adopted is Peer Review vs. Change Approval! Source: Puppet Labs 2014 State Of DevOps
  • The Second Way: Outcomes
    • Defects and security issues getting fixed faster than ever
    • Disciplined automated testing enabling many simultaneous small, agile teams to work productively
    • All groups communicating and coordinating better
    • Everybody is getting more work done
  • The Third Way: Continual Experimentation And Learning
  • Break Things Early And Often
    “Do painful things more frequently, so you can make it less painful… We don’t get pushback from Dev, because they know it makes rollouts smoother.” – Adrian Cockcroft, Former Architect, Netflix (Now Technology Fellow, Battery Ventures)
  • Inject Failures Often
  • You Don’t Choose Chaos Monkey… Chaos Monkey Chooses You
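The idea behind injecting failures often can be sketched as a small experiment: take a randomly chosen instance out of service and check that the service as a whole stays up. This is a toy simulation under stated assumptions, not Netflix's Chaos Monkey implementation; the `Instance` class, the instance names and the availability rule are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool = True


def inject_failure(instances, rng):
    """Take one randomly chosen healthy instance out of service."""
    candidates = [i for i in instances if i.healthy]
    if not candidates:
        return None
    victim = rng.choice(candidates)
    victim.healthy = False
    return victim


def service_available(instances):
    """Assumed rule: the service is up while any instance is healthy."""
    return any(i.healthy for i in instances)


rng = random.Random(42)  # seeded so the experiment is repeatable
pool = [Instance("web-1"), Instance("web-2"), Instance("web-3")]

victim = inject_failure(pool, rng)
print(f"terminated {victim.name}; service up: {service_available(pool)}")
```

Running such an experiment routinely, rather than waiting for a real outage, is what turns redundancy from an assumption into something that is verified.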
  • Allocate 20% Of Cycles To Technical Debt Reduction
  • “By November 2011, Kevin Scott, LinkedIn’s top engineer, had had enough. The system was taxed as LinkedIn attracted more users, and engineers were burnt out. To fix the problems, Scott, who’d arrived from Google that February, launched Operation InVersion. He froze development on new features so engineers could overhaul the computing architecture. ‘We had to tell management we’re not going to deliver anything new while all of engineering works on this project for the next two months,’ Scott says. ‘It was a scary thing.’”
  • Source: Pingdom
  • Why Do I Think This Is Important?
  • The Downward Spiral…
  • Opportunity Cost Of Wasted IT Spending? $2,600,000,000,000.00 per year ($2.6 Trillion US)
  • DevOps Enterprise Summit
    • Save the date: October 21-23, 2014
    • DevOps Enterprise is a conference for horses, by horses
    • Macy’s, Disney, GE Capital, Blackboard, Telstra, US Department of Homeland Security, CSG, Raytheon, Ticketmaster/LiveNation, Capital One, Nordstrom, Union Bank of California
    • Leaders driving DevOps transformations will talk about
      • The business problem they set out to solve
      • The obstacles they had to overcome
      • The business value they created
    • Submit talks at: http://devopsenterprisesummit.com/
  • Our Mission: Positively Impact The Lives Of One Million IT Professionals By 2017
    • Free 170 page excerpt: http://itrevolution.com/the-phoenix-project-excerpt/
    • http://slideshare.net/realgenekim
    • DevOps Defensive Audit Toolkit: http://bit.ly/DevOpsAudit
    • Early draft of upcoming “DevOps Cookbook” (Allspaw, DeBois, Edwards, Humble, Kim, Willis)
    • Email me at genek@realgenekim.me
  • Can Large Orgs Be High Performers? Yes. But orgs with 10,000+ employees are 40% less likely to be high performing vs. 500 employee orgs… Source: Puppet Labs 2014 State Of DevOps
  • Other Side Of Innovation