This document discusses the benefits of adopting DevOps practices. It notes that wasted IT spending amounts to $2.6 trillion per year and that traditional divisions between development and operations hamper business goals. Adopting DevOps allows for faster delivery of code changes, more reliable systems through better feedback, and an organizational culture of continual learning through experimentation. Companies that have implemented DevOps see benefits like 30x more frequent deployments, 8,000x faster lead times, and higher success rates and availability. The document advocates that all organizations can achieve these gains through DevOps.
23. Making Changes When It Matters Most
“By installing a rampant innovation culture, we performed 165
experiments in the peak three months of tax season.”
“Our business result? Conversion rate of the website is up 50
percent. Employee result? Everyone loves it, because now their
ideas can make it to market.”
–Scott Cook, Intuit Founder
@RealGeneKim
24. Who Is Doing DevOps?
Google, Amazon, Netflix, Etsy, Spotify, Twitter, Facebook…
BNP Paribas, BNY Mellon, World Bank, Paychex, Intuit…
The Gap, Nordstrom, Macy’s, Williams-Sonoma, Target …
SAP, HP, General Motors, Northrop Grumman …
UK Government, Kansas State University…
Who else?
25. High Performing DevOps Teams
They’re more agile
30x more frequent deployments
8,000x faster lead time than their peers
They’re more reliable
2x the change success rate
12x faster MTTR
Source: Puppet Labs 2013 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
27. “This book will have a profound effect on IT, just as The Goal did for manufacturing.”
–Jez Humble, co-author, Continuous Delivery
“This is the IT swamp draining manual for anyone who is neck deep in alligators.”
–Adrian Cockcroft, Cloud Architect at Netflix
“This is The Goal for our decade, and is for any IT professional who wants their life back.”
–Charles Betz, IT architect, author, “Architecture and Patterns for IT”
30. “What is your lead time for changes?”
“How long does it take to go from
code committed to code successfully
running in production?”
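One way to put a number on that question is sketched below: a minimal Python example that treats lead time as the gap between a commit timestamp and the moment the same change is running in production. The function names and the sample timestamps are assumptions made for illustration; they do not come from the slides or from any particular tool.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_time(committed_at: datetime, deployed_at: datetime) -> timedelta:
    """Time from code committed to code running in production."""
    if deployed_at < committed_at:
        raise ValueError("deploy time precedes commit time")
    return deployed_at - committed_at


def median_lead_time_hours(changes) -> float:
    """Median lead time, in hours, over (committed_at, deployed_at) pairs."""
    hours = [lead_time(c, d).total_seconds() / 3600 for c, d in changes]
    return median(hours)


# Hypothetical sample data: three changes with 2 h, 24 h and 6 h lead times.
sample_changes = [
    (datetime(2014, 1, 6, 9, 0), datetime(2014, 1, 6, 11, 0)),
    (datetime(2014, 1, 6, 10, 0), datetime(2014, 1, 7, 10, 0)),
    (datetime(2014, 1, 7, 8, 0), datetime(2014, 1, 7, 14, 0)),
]

print(median_lead_time_hours(sample_changes))  # 6.0
```

The median is used here rather than the mean so that one unusually slow change does not dominate the figure; that is a design choice of this sketch, not something the slides prescribe.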
33. Create One Step Environment Creation Process
Make environments available early in the Development process
Make sure Dev builds the code and environment at the same time
Create a common Dev, QA and Production environment creation
process
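The idea of a common creation process can be sketched roughly as follows: every environment is built by the same function from the same baseline, and only the size differs. This is a minimal illustration in Python; `EnvironmentSpec`, `create_environment`, the package names and the instance counts are all hypothetical and stand in for whatever provisioning tooling a team actually uses.

```python
from dataclasses import dataclass

# Hypothetical baseline shared by every environment (illustrative names only).
BASE_PACKAGES = ("web-server", "app-runtime", "monitoring-agent")

# Hypothetical sizing: the only thing allowed to differ per environment.
INSTANCE_COUNTS = {"dev": 1, "qa": 2, "production": 6}


@dataclass(frozen=True)
class EnvironmentSpec:
    name: str
    packages: tuple
    instance_count: int


def create_environment(name: str) -> EnvironmentSpec:
    """Build the spec for Dev, QA or Production through one code path."""
    if name not in INSTANCE_COUNTS:
        raise ValueError(f"unknown environment: {name}")
    return EnvironmentSpec(
        name=name,
        packages=BASE_PACKAGES,
        instance_count=INSTANCE_COUNTS[name],
    )


# One step per environment, same process for all three.
environments = [create_environment(n) for n in ("dev", "qa", "production")]
```

Because all three specs come out of one function, a change to the baseline reaches Dev, QA and Production together, which is what keeps the environments from drifting apart.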
34. If I had a magic wand, I’d change the Agile sprints and definition
of “done”:
“At the end of each sprint, we must have working and shippable
code, demonstrated in an environment that resembles production.”
35. The First Way: Outcomes
Creating single repository for code and environments
Determinism in the release process
Consistent Dev, Test and Production environments, all properly built before
deployment begins
Features being deployed daily without catastrophic failures
Decreased lead time
Faster cycle time and release cadence
Technologies needed: configuration management, provisioning, automated
testing
38. How many times is the andon cord pulled in a typical day at
a Toyota manufacturing plant?
3,500 times per day
39. Why would Toyota do something so disruptive as stopping
production thousands of times per day?
“It’s the only way we can build 2,000 vehicles per day – that’s one
completed vehicle every 55 seconds.”
40. Google Dev And Ops (2013)
15,000 engineers, working on 4,000+ projects
All code is checked into one source tree (billions of files!)
5500 code commits/day
75 million test cases are run daily
"Automated tests transform fear into boredom."
-- Eran Messeri, Google
41. Developers Carry Pagers
“We found that when we woke up developers at 2am, defects got
fixed faster than ever”
– Patrick Lightbody, CEO, BrowserMob
“You build it, you run it.”
– Werner Vogels, CTO, Amazon
42. The Second Way:
Outcomes
Defects and security issues getting fixed faster than ever
Disciplined automated testing enabling many simultaneous small,
agile teams to work productively
All groups communicating and coordinating better
Everybody is getting more work done
Technologies needed: automated regression testing, static code
analysis, production monitoring
44. Break Things Early And Often
“Do painful things more frequently, so you can make it less painful…
We don’t get pushback from Dev, because they know it makes
rollouts smoother.”
– Adrian Cockcroft, Architect, Netflix
47. You Don’t Choose Chaos Monkey…
Chaos Monkey Chooses You
48. The Third Way:
Outcomes
A culture that values learning
A culture of fearless improvement (as opposed to a culture of fear)
Development, Test and IT Operations are enabling the organization to
out-innovate the competition and help the business win
Technologies needed: great production monitoring
53. Objections You May Hear
“This is only for the unicorns. We’re not Google or Amazon or
Spotify.”
“The IT Operations monitoring market is 20 years old. All the
product sales have already been sold. There’s no new
opportunities out there.”
“All these problems are process and cultural issues. That’s what I
need to fix, not implement tools.”
54. If I Could Wave A Magic Wand, Everyone Would…
Have belief and confidence that you can show prospects their
own downward spiral stories that will resonate, and resonate at
the highest levels
Be able to challenge prospects to have their own “aha” moment
and be able to help them start their own transformations
Be able to help your customers automate their processes, not just
to increase availability, but to help them enable innovation
Help your customers win in the marketplace, free them from
tedium and suffering, and achieve their highest and best potential
as fellow human beings
55. Our Mission: Positively Impact The Lives Of One Million IT
Professionals By 2017
“Some books you give to friends, for the joy of sharing a great novel.
“Some books you recommend to your colleagues and employees, to find common ground.
“Some books you share with your boss, to plant the seeds of a big idea.
“The Phoenix Project is all three.”
–Jeremiah Shirk, DevOps Leader, Kansas State University
Free 170 page excerpt:
http://itrevolution.com/the-phoenix-project-excerpt/
Editor's Notes
Who are they auditing? IT operations. I love IT operations. Why? Because when the developers screw up, the only people who can save the day are the IT operations people. Memory leak? No problem, we’ll do hourly reboots until you figure that out.
Who here is from IT operations?
Bad day:
Not as prepared for the audit as they thought
Spending 30% of their time scrambling, generating presentations for auditors
Or an outage, and the developer is adamant that they didn’t make the change – they’re saying, “it must be the security guys – they’re always causing outages”
Or there are 50 systems behind the load balancer, and six systems are acting funny – what’s different, and who made them different?
Or every server is like a snowflake, each having its own personality
We as Tripwire practitioners can help them make sure changes are made visible, authorized, and deployed completely and accurately; find differences; and create and enforce a culture of change management and causality
Source: Flickr: birdsandanchors
Who’s introducing variance? Well, it’s often these guys. Show me a developer who isn’t causing an outage, I’ll show you one who is on vacation.
Primary measurement is deploying features quickly – get to market.
I’ve worked with two of the five largest Internet companies (Google, Microsoft, Yahoo, AOL, Amazon), and I now believe that the biggest differentiator to great time to market is great operations.
Bad day: We do 6 weeks of testing, but deployment still fails. Why? QA environment doesn’t match production
Or there’s a failure in testing, and no one can agree whether it’s a code failure or an environment failure
Or changes are made in QA, but no one wrote them down, so they didn’t get replicated downstream in production
Believe it or not, we as Tripwire practitioners can even help them – make sure environments are available when we need them, that they’re configured correctly the first time, document all the changes, replicate them downstream
[ picture of messy data center ] Ten minutes into Bill’s first day on the job, he has to deal with a payroll run failure. Tomorrow is payday, and finance just found out that while all the salaried employees are going to get paid, none of the hourly factory employees will. All their records from the factory timekeeping systems were zeroed out.
Was it a SAN failure? A database failure? An application failure? Interface failure? Cabling error?
So who are all these constituencies that we can help, and increase our relevance as Tripwire practitioners and champions?
How many people here are in infosec?
Goal: protect critical systems and data
Safeguard organizational commitments
Prevent security breaches, help quickly detect and recover from them
Bad day: no security standards
No one is complying
Yes, we’re 3 years behind. “Whaddyagonna do about it?”
Vs. we (Tripwire owner) can become more relevant and add value by helping infosec leverage all the configuration guidance out there
Measure variance between production and those known good states
Trust and verify that when management says, we’ve trued up the configurations, they’ve actually done it
Why? Now, more than ever, there is an ever-increasing number of regulatory and contractual requirements to protect systems and data
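Measuring variance against a known good state can be sketched in a few lines of Python. This is an illustration only: `find_drift`, the setting names and the server names are hypothetical, and the sketch does not represent how Tripwire or any other product implements the comparison.

```python
def find_drift(baseline: dict, actual: dict) -> dict:
    """Return {setting: (expected, found)} for every setting that differs.

    A setting missing on either side is reported with None on that side.
    """
    drift = {}
    for key in sorted(baseline.keys() | actual.keys()):
        expected = baseline.get(key)
        found = actual.get(key)
        if expected != found:
            drift[key] = (expected, found)
    return drift


# Hypothetical known good state and two hypothetical servers.
known_good = {"ssh_root_login": "no", "tls_min_version": "1.2"}
servers = {
    "web-01": {"ssh_root_login": "no", "tls_min_version": "1.2"},
    "web-02": {"ssh_root_login": "yes", "tls_min_version": "1.2", "debug": "on"},
}

for name, config in servers.items():
    print(name, find_drift(known_good, config))
```

An empty result for a server means it matches the known good state; anything else is a concrete list of what to true up and then verify.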
There are many ways to react to this: like, fear, horror, trying to become invisible… All understandable, given the circumstances… Because infosec can no longer take 4 weeks to turn around a security review for application code, or take 6 weeks to turn around a firewall change. But, on the other hand, I think it will be the best thing to happen to infosec in the past 20 years. We’re calling this Rugged DevOps, because it’s a way for infosec to integrate into the DevOps process, and be welcomed. And not be viewed as the shrill hysterical folks who slow the business down.
Tell story of Amazon, Netflix: they care about availability, security
It’s not a push, it’s a pull – they’re looking for our help (#1 concern: fear of disintermediation and being marginalized)