I have a love of operations and what it takes to keep online services up and running.
Who’s introducing variance? Well, it’s often these guys. Show me a developer who isn’t causing an outage, and I’ll show you one who is on vacation. Their primary measurement is deploying features quickly – get to market.
How each side actively impedes the achievement of the other’s goals.
2011 03 14 dev ops meetup - top lessons creating dev-ops super-tribes 2b
Visible Ops: Playbook of High Performers<br />The IT Process Institute has been studying high-performing organizations since 1999<br />What is common to all the high performers?<br />What is different between them and average and low performers?<br />How did they become great?<br />Answers have been codified in the Visible Ops Methodology<br />Over 180K copies sold since 2006<br />www.ITPI.org<br />
Now, More Than Ever, We Need Great IT Operations<br />In addition to delivering the online services we promised, when the business needs to take corrective actions to:<br />Reduce costs<br />Increase efficiencies<br />Gain competitive advantage<br />Where we need to be…<br />IT is always in the way (again…)<br />We are here…<br />
6 times more applications</li></ul>Source: IT Process Institute, 2008<br />
The Vicious Downward Spiral<br />Operations Sees…<br />Fragile applications are prone to failure<br />Long time required to figure out “which bit got flipped”<br />Detective control is a salesperson<br />Too much time required to restore service<br />Too much firefighting and unplanned work <br />Planned project work cannot complete<br />Frustrated customers leave<br />Market share goes down<br />Business misses Wall Street commitments<br />Business makes even larger promises to Wall Street<br />Dev Sees…<br />More urgent, date-driven projects put into the queue<br />Even more fragile code put into production<br />More releases have increasingly “turbulent installs”<br />Release cycles lengthen to amortize “cost of deployments”<br />Failing bigger deployments more difficult to diagnose<br />Most senior and constrained IT ops resources have less time to fix underlying process problems<br />Ever increasing backlog of infrastructure projects that could fix root cause and reduce costs<br />Ever increasing amount of tension between IT Ops and Development<br />These aren’t IT Operations problems…These are business problems!<br />
Operations Inside The Dev/Ops Super-Tribe<br />Increase flow from Dev to Production<br />Increase throughput<br />Decrease WIP<br />Our goal is to create a system of operations that allows <br />Planned work to quickly move to production<br />Ensure service is quickly restored when things go wrong<br />How does this relate to Visible Ops?<br />We focused much on “unplanned work”<br />What’s happening to all the planned work?<br />At any given time, what should IT Ops be working on?<br />Now we are focusing on the flow of planned work<br />
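The link between decreasing WIP and increasing flow can be made concrete with Little's Law (average cycle time = WIP / throughput). The slides do not cite this law; it is offered here as one common way to reason about the goal. The function name and the numbers below are hypothetical, chosen only for illustration.

```python
def average_cycle_time(wip_items: float, throughput_per_day: float) -> float:
    """Little's Law: average cycle time = work in progress / throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_day


# Hypothetical numbers, not taken from the slides:
# same throughput, but WIP cut from 40 items to 10.
before = average_cycle_time(wip_items=40, throughput_per_day=2)
after = average_cycle_time(wip_items=10, throughput_per_day=2)
print(before, after)
```

Under these assumed numbers, holding throughput constant while cutting WIP shortens the average time a piece of planned work waits before reaching production.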
Zone #1: Decrease Cycle Time Of Releases<br />Create determinism in the release process<br />Move packaging responsibility to development<br />Release early and often<br />Decrease release cycle time<br />Reduce deployment times from 6 hours to 45 minutes<br />Refactor deployment process that had 1300+ steps spanning 4 weeks<br />Never again “fix forward,” instead “roll back,” escalating any deviation from plan to Dev<br />Verify for all handoffs (e.g., correctness, accuracy, timeliness, etc…)<br />Ensure environments are properly built before deployment begins<br />Control code and environments down the preproduction runways<br />Hold Dev, QA, Int, and Staging owners accountable for integrity<br />
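A minimal sketch of two of the Zone #1 ideas: check that the environment is properly built before deployment begins, and roll back (rather than fix forward) on any deviation from plan. Every name here (`Release`, `environment_drift`, `deploy`, the environment keys) is hypothetical and not from the slides; a real pipeline would call out to actual build and deployment tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


@dataclass
class Release:
    version: str
    steps: List[Step] = field(default_factory=list)


def environment_drift(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return the keys whose actual value differs from the expected build spec."""
    return sorted(k for k in expected if actual.get(k) != expected[k])


def deploy(release: Release,
           expected_env: Dict[str, str],
           actual_env: Dict[str, str],
           rollback: Callable[[], None]) -> str:
    # Preflight: refuse to start if the environment is not built as specified.
    drift = environment_drift(expected_env, actual_env)
    if drift:
        return "blocked: environment drift in " + ", ".join(drift)
    # Run the planned steps in order; on the first deviation, roll back.
    for name, action in release.steps:
        if not action():
            rollback()
            return f"rolled back: step '{name}' deviated from plan; escalate to Dev"
    return f"deployed {release.version}"


# Usage with made-up steps: the second step fails, so the release rolls back.
log: List[str] = []
release = Release("1.4.2", [("copy package", lambda: True),
                            ("migrate schema", lambda: False)])
result = deploy(release, {"os": "rhel6"}, {"os": "rhel6"},
                rollback=lambda: log.append("rollback"))
print(result)
```

The preflight check runs before any step executes, so a mis-built environment blocks the release instead of surfacing as a failure partway through.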
Zone #2: Increase Production Rigor<br />Define what work is and where work can come from<br />Protect the integrity of the work queue (e.g., are checks being written that won’t clear?)<br />To preserve and increase throughput, elevate preventive projects and maintenance tasks<br />Document all work, changes and outcomes so that they are repeatable<br />Ops builds Agile standardized deployment stories, to be completed after Dev sprints are complete<br />Maintain adequate situational awareness so that incidents can be quickly detected and corrected<br />Standardize unplanned work and escalations<br />Always seek to eradicate unplanned work and increase throughput<br />Lean Principle: “Better -> Faster -> Cheaper”<br />
“When IT Fails: The Novel”<br />Steve Masters, CEO<br />Bill Palmer, VP IT Operations<br />Chris Anderson, VP Development<br />Parts Unlimited<br />$4B revenue/year<br />
Resources<br />From the IT Process Institute www.itpi.org<br />Both Visible Ops Handbooks<br />ITPI IT Controls Performance Study<br />“Lean IT” by Orzen and Bell<br />Winner of the Shingo Prize 2011<br />“Web Operations: Keeping The Data On Time” by Allspaw, Robbins<br />“Inspired: How To Create Products That Customers Love” by Cagan<br />“Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation” by Humble, Farley<br />Follow Gene Kim<br />On Twitter: @RealGeneKim<br />mailto:firstname.lastname@example.org<br />Blog: http://realgenekim.me/blog<br />Follow Mike Orzen<br />On Twitter: @MikeOrzenLeanIT<br />mailto:email@example.com<br />http://www.steadyimprovement.com<br />
About Gene Kim<br />I’ve spent the last 10 years studying high performing IT organizations, trying to understand:<br />What do they have in common?<br />What is present in successful transformations, absent in unsuccessful transformations?<br />How do we lower the activation energy required to create the transformations?<br />Founder and former CTO of Tripwire, Inc., a $100M automated security/compliance software company<br />Co-author of Visible Ops Handbook, Security Visible Ops Handbook (over 180K copies sold)<br />Active researcher<br />Co-founder of IT Process Institute<br />Committee member of Institute of Internal Auditors<br />Leader of PCI Security Standards Council Scoping SIG<br />