@RealGeneKim 
Why Everyone Needs DevOps 
Now: 
My Fifteen Year Journey Studying 
High Performing IT Organizations 
Gene Kim 
Session ID:
IT Operations 
The Product Managers 
The Developers 
IT Ops And Dev At War 
The Downward 
Spiral…
There Is A Better Way… 
Google, Amazon, Netflix, 
Spotify, Etsy, Twitter, 
Facebook…
10 deploys per day 
Dev & ops cooperation at Flickr 
John Allspaw & Paul Hammond 
Velocity 2009 
Source: John Allspaw (@allspaw) and Paul Hammond (@ph)
Little bit weird 
Sits closer to the boss 
Thinks too hard 
Pulls levers & turns knobs 
Easily excited 
Yells a lot in emergencies 
Source: John Allspaw (@allspaw) and Paul Hammond (@ph)
Ops who think like devs 
Devs who think like ops 
Source: John Allspaw (@allspaw) and Paul Hammond (@ph)
Dev and Ops 
Source: John Allspaw (@allspaw) and Paul Hammond (@ph)
DevOps 
is incomplete, 
is interpreted wrong, 
and is too isolated 
Source: Theo Schlossnagle (@postwait)
.*Ops 
Source: Theo Schlossnagle (@postwait)
^(?<dept>.+)Ops$ 
Source: Theo Schlossnagle (@postwait)
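The pattern on the slide uses the `(?<dept>…)` named-group spelling found in PCRE, .NET and JavaScript; Python's `re` module spells the same group `(?P<dept>…)`. A minimal sketch of what the pattern captures, using made-up team names that are not from the talk:

```python
import re

# Same pattern as the slide, written with Python's named-group syntax.
OPS_PATTERN = re.compile(r"^(?P<dept>.+)Ops$")

def department(name):
    """Return the part before 'Ops', or None if the name does not match."""
    match = OPS_PATTERN.match(name)
    return match.group("dept") if match else None

# Hypothetical team names, for illustration only.
for team in ["DevOps", "SecOps", "MarketingOps", "Ops", "Operations"]:
    print(team, "->", department(team))
```

Because `.+` needs at least one character, a bare "Ops" does not match: the pattern only describes some other department joined to Ops.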
Source: John Jenkins, Amazon.com
Making Changes When It Matters Most 
“By installing a rampant innovation culture, 
we performed 165 experiments in the peak three 
months of tax season.” 
“Our business result? Conversion rate of the 
website is up 50 percent. Employee result? 
Everyone loves it, because now their ideas can 
make it to market.” 
–Scott Cook, Intuit Founder
Who Is Doing DevOps? 
• Google, Amazon, Netflix, Etsy, Spotify, Twitter, Facebook … 
• Dynatrace, CSC, IBM, CA, SAP, HP, Microsoft, Red Hat, … 
• GE Capital, Nationwide, BNP Paribas, BNY Mellon, World Bank, Paychex, Intuit … 
• The Gap, Nordstrom, Macy’s, Williams-Sonoma, Target … 
• General Motors, Raytheon, LEGO, Bosch … 
• UK Government, US Department of Homeland Security … 
• Kansas State University… 
Who else?
High Performers Are More Agile 
30x more frequent deployments 
8,000x faster lead times than their peers 
Source: Puppet Labs 2013 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
High Performers Are More Reliable 
2x the change success rate 
12x faster mean time to recover (MTTR) 
Source: Puppet Labs 2013 State Of DevOps: http://puppetlabs.com/2013-state-of-devops-infographic
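The slide does not say how the report computes these two figures, so the formulas below are assumptions: success rate as the fraction of changes that did not fail, and MTTR as the average time from the start of an incident to service restored. The sample data is made up.

```python
from datetime import datetime, timedelta

def change_success_rate(outcomes):
    """Fraction of changes that succeeded; outcomes is a list of booleans."""
    return sum(outcomes) / len(outcomes)

def mean_time_to_recover(incidents):
    """Average of (restored - started) over a list of (started, restored) pairs."""
    total = sum((restored - started for started, restored in incidents), timedelta())
    return total / len(incidents)

# Made-up sample data, not figures from the report.
outcomes = [True, True, False, True]
incidents = [
    (datetime(2014, 1, 6, 2, 0), datetime(2014, 1, 6, 2, 30)),
    (datetime(2014, 2, 3, 14, 0), datetime(2014, 2, 3, 15, 30)),
]

print(change_success_rate(outcomes))    # 0.75
print(mean_time_to_recover(incidents))  # 1:00:00
```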
High Performers Win In The Marketplace 
2x more likely to exceed profitability, market share & productivity goals 
50% higher market capitalization growth over 3 years* 
Source: Puppet Labs 2014 State Of DevOps
Source: Darren Hague (@dhague)
“This book will have a profound effect on IT, 
just as The Goal did for manufacturing.” 
–Jez Humble, 
co-author Continuous Delivery 
“This is the IT swamp draining manual for 
anyone who is neck deep in alligators.” 
–Adrian Cockcroft, 
Cloud Architect at Netflix 
“This is The Goal for our decade, 
and is for any IT professional who wants 
their life back.” 
–Charles Betz, IT architect, author 
“Architecture and Patterns for IT” 
The First Way: Flow
“deploys per day” 
vs. 
“lead time”
“What is your lead time 
for changes?” 
“How long does it take to go from 
code committed to code successfully 
running in production?”
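The question on this slide can be turned into a measurement. A minimal sketch, assuming you can record two timestamps per change (when it was committed and when it was running in production); the timestamps below are hypothetical:

```python
from datetime import datetime

def lead_time(committed_at, live_at):
    """Time from code committed to code successfully running in production."""
    return live_at - committed_at

# Hypothetical (commit time, time running in production) pairs.
changes = [
    (datetime(2014, 11, 3, 9, 0), datetime(2014, 11, 3, 9, 45)),
    (datetime(2014, 11, 3, 13, 0), datetime(2014, 11, 4, 13, 0)),
    (datetime(2014, 11, 4, 10, 0), datetime(2014, 11, 4, 12, 0)),
]

lead_times = sorted(lead_time(committed, live) for committed, live in changes)
median = lead_times[len(lead_times) // 2]  # middle value of an odd-length list
print("median lead time:", median)  # median lead time: 2:00:00
```

The median is used here rather than the mean so that one slow change (the 24-hour one) does not dominate the number; that is a design choice of this sketch, not something the slide prescribes.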
IT’S A TRAP
Create One Step Environment Creation Process 
• Make environments available early in the Development process 
• Make sure Dev builds the code and environment at the same time 
• Create a common Dev, QA and Production environment creation process
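One way to read "a common environment creation process" is a single function that every environment goes through, with only size differing. The sketch below only builds a description of each environment and provisions nothing; the image and package names are hypothetical.

```python
# One shared definition for every environment (hypothetical values).
BASE_ENVIRONMENT = {
    "os_image": "example-linux-image",
    "packages": ["app-runtime", "web-server"],
}

def create_environment(name, instance_count):
    """Build an environment description from the one shared definition."""
    spec = dict(BASE_ENVIRONMENT, name=name, instance_count=instance_count)
    spec["packages"] = list(BASE_ENVIRONMENT["packages"])  # copy, do not share
    return spec

def shape(spec):
    """The parts that must be identical across Dev, QA and Production."""
    return (spec["os_image"], spec["packages"])

dev = create_environment("dev", 1)
qa = create_environment("qa", 2)
production = create_environment("production", 20)

print(shape(dev) == shape(qa) == shape(production))  # True
```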
If I had a magic wand, 
I’d change the Agile sprints and 
definition of “done”: 
“At the end of each sprint, we must 
have working and shippable code… 
demonstrated in an environment 
that resembles production.”
Deploy Smaller Changes, More Frequently * 
Source: http://www.facebook.com/note.php?note_id=14218138919
Deploy Smaller Changes, More Frequently * 
• Decouple feature releases from code deployments 
• Deploy features in a disabled state, using feature flags 
• Require all developers to check code into trunk at least daily 
• Practice deploying smaller changes, which dramatically reduces risk and improves MTTR
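Decoupling release from deployment can be sketched with a feature flag: the new code path ships to production switched off, and turning it on is a configuration change rather than a deploy. The flag name and the checkout functions are hypothetical.

```python
# The new code is deployed, but the flag keeps it disabled (hypothetical flag name).
FEATURE_FLAGS = {"new_checkout": False}

def is_enabled(flag, flags=FEATURE_FLAGS):
    """Unknown flags default to off, so a typo never enables a feature."""
    return flags.get(flag, False)

def checkout(cart_total, flags=FEATURE_FLAGS):
    if is_enabled("new_checkout", flags):
        return f"new checkout: {cart_total:.2f}"
    return f"old checkout: {cart_total:.2f}"

print(checkout(19.99))                          # old checkout: 19.99
print(checkout(19.99, {"new_checkout": True}))  # new checkout: 19.99
```

Turning the feature off again is the same one-line change, which is part of why small, flagged changes help MTTR.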
Experiment: Reducing Batch Size By 50% 
And the customer got the feature in 
half the time! 
Source: Scott Prugh, Chief Architect, CSG, Inc.
“As a lifelong Ops practitioner, I know 
we need DevOps to make our work 
humane. 
In the past, I’ve worked every holiday, on 
my birthday, my spouse’s birthday, and 
even on the day my son was born.” 
Nathan Shimek 
Engineering Manager, New Context 
@nathan_shimek
Breaking The Bottlenecks In The Flow 
• Environment creation 
• Code deployment 
• Test setup and run (mention @rohansingh) 
• Overly tight architecture 
• Development 
• Product management
“In November 2011, running even the most minimal 
test for CloudFoundry required deploying to 45 virtual 
machines, which took a half hour. This was way too 
long, and also prevented developers from testing on 
their own workstations. 
By using containers, within months, we got it down to 
18 virtual machines so that any developer can deploy 
the entire system to single VM in six minutes.” 
— Elisabeth Hendrickson, Director of Quality 
Engineering, Pivotal Labs 
@testobsessed
Blackboard Learn: 2005-Present 
The Problem 
[Chart: LoC, Commits] 
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
Blackboard Learn Building Blocks 
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
Top Predictors Of IT Performance (2014) 
• Version control of all production artifacts 
• Continuous integration and deployment 
• Automated acceptance testing 
• Peer-review of production changes (vs. external change approval) 
• High trust culture 
• Proactive monitoring of the production environment 
• Win-win relationship between Dev and Ops 
Source: Puppet Labs 2014 State Of DevOps
The First Way: Outcomes 
• Creating single repository for code and environments 
• Determinism in the release process 
• Consistent Dev, Test and Production environments, all properly built before deployment begins 
• Features being deployed daily without catastrophic failures 
• Decreased lead time 
• Faster cycle time and release cadence
The Second Way: Feedback
How many times is the andon cord 
pulled in a typical day at a Toyota 
manufacturing plant? 
3,500 times per day 
Source: http://www.gembapantarei.com/2008/04/how_many_times_do_you_pull_the_andon_cord_each_day.html
Why would Toyota do something so disruptive as 
stopping production thousands of times per day? 
“It’s the only way we can build 2,000 vehicles 
per day – that’s one completed vehicle every 
55 seconds.”
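A software analogue of the andon cord is a pipeline that stops at the first failing stage instead of passing a defect downstream. A minimal sketch with hypothetical stage names; the checks are stand-ins for real build and test steps.

```python
def run_pipeline(stages):
    """Run stages in order and stop at the first failure (pull the cord).

    stages is a list of (name, check) pairs where check() returns True on success.
    Returns (completed stage names, name of the failing stage or None).
    """
    completed = []
    for name, check in stages:
        if not check():
            return completed, name
        completed.append(name)
    return completed, None

# Hypothetical stages; the unit tests fail, so deploy never runs.
stages = [
    ("build", lambda: True),
    ("unit tests", lambda: False),
    ("deploy", lambda: True),
]

completed, stopped_at = run_pipeline(stages)
print(completed, stopped_at)  # ['build'] unit tests
```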
Google Dev And Ops (2013) 
• 15,000 engineers, working on 4,000+ projects 
• All code is checked into one source tree (billions of files!) 
• 5,500 code commits/day 
• 75 million test cases are run daily 
"Automated tests transform fear into boredom." 
-- Eran Messeri, Google
Developers Carry Pagers 
“We found that when we woke up developers at 
2am, defects got fixed faster than ever” 
– Patrick Lightbody, 
CEO, BrowserMob 
“You build it, you run it.” 
– Werner Vogels 
CTO, Amazon
Developers Carry Pagers 
“As a developer, there has never been a more 
satisfying point in my career than when I wrote 
the code, I pushed the button to deploy it, 
I watched the metrics to see if it actually worked 
in production, and fixed it if it broke.” 
– Tim Tischler 
Director of Operations Engr, 
Nike, Inc.
Devs Initially Self-Manage Their Own Code 
Source: Tom Limoncelli (@yesthattom)
Return Fragile Services Back To Dev 
Source: Tom Limoncelli (@yesthattom)
Pervasive Production Telemetry 
“Having a developer add a monitoring metric 
shouldn’t feel like a schema change.” 
– John Allspaw, SVP Tech Ops, Etsy
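To make the point concrete: with a fire-and-forget metrics client, adding a metric is one line at the call site. The sketch below assumes a StatsD-style collector listening on UDP port 8125; the slide names no tool, and the metric name is hypothetical.

```python
import socket

def statsd_counter(name, value=1):
    """Format a counter increment in the StatsD line format: <name>:<value>|c"""
    return f"{name}:{value}|c".encode("ascii")

def increment(name, host="127.0.0.1", port=8125):
    """Send one counter increment over UDP; nothing is read back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(statsd_counter(name), (host, port))

# What goes on the wire for a hypothetical metric.
print(statsd_counter("checkout.completed"))  # b'checkout.completed:1|c'
```

In application code the whole change is a single call such as `increment("checkout.completed")`, with no schema or dashboard change needed first.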
People actually look at the logs! 
(Mention Verizon PCI Data Breach Study) 
One Of The Highest Predictors Of 
Performance
Top Predictors Of IT Performance (2014) 
• Version control of all production artifacts 
• Continuous integration and deployment 
• Automated acceptance testing 
• Peer-review of production changes (vs. external change approval) 
• High trust culture 
• Proactive monitoring of the production environment 
• Win-win relationship between Dev and Ops 
Source: Puppet Labs 2014 State Of DevOps
The Second Way: Outcomes 
• Defects and security issues getting fixed faster than ever 
• Disciplined automated testing enabling many simultaneous small, agile teams to work productively 
• All groups communicating and coordinating better 
• Everybody is getting more work done
The Third Way: 
Continual Experimentation And Learning 
Break Things Early And Often 
“Do painful things more frequently, so you can 
make it less painful… We don’t get pushback 
from Dev, because they know it makes rollouts 
smoother.” 
– Adrian Cockcroft, 
Former Architect, Netflix 
(Now Technology Fellow, 
Battery Ventures)
Inject Failures Often
You Don’t Choose Chaos Monkey… 
Chaos Monkey Chooses You
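The idea of injecting failures can be sketched in a few lines: pick a running instance at random and terminate it, so recovery gets exercised routinely. This is an illustration of the idea only, not Netflix's Chaos Monkey; the instance names and the `terminate` callback are hypothetical.

```python
import random

def pick_victim(instances, rng):
    """Choose one instance at random; nobody gets to opt out."""
    return rng.choice(instances)

def inject_failure(instances, rng, terminate):
    """Terminate one randomly chosen instance and report which one it was."""
    victim = pick_victim(instances, rng)
    terminate(victim)
    return victim

# Hypothetical fleet; 'terminate' just records the name instead of killing anything.
fleet = ["web-1", "web-2", "web-3"]
terminated = []
victim = inject_failure(fleet, random.Random(), terminated.append)
print("terminated:", victim)
```

In a real setup the `terminate` callback would call the platform's instance-termination API, and the exercise would run on a schedule.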
The 2014 AWS Reboot 
“When we got the news about the emergency EC2 
reboots, our jaws dropped. When we got the list of 
how many Cassandra nodes would be affected, I 
felt ill. 
“Then I remembered all the Chaos Monkey 
exercises we’ve gone through. My reaction 
was, ‘Bring it on!’” 
– Christos Kalantzis 
Netflix Cloud DB Engineering 
Source: http://techblog.netflix.com/2014/10/a-state-of-xen-chaos-monkey-cassandra.html
The 2014 AWS Reboot 
“Out of our 2700+ production Cassandra nodes, 
218 were rebooted. 22 Cassandra nodes did not 
reboot successfully. 
“Netflix customers experienced no downtime that 
weekend.” 
– Bruce Wong 
Netflix Chaos Engineering
Allocate 20% Of Cycles To Technical 
Debt Reduction
“By November 2011, Kevin Scott, 
LinkedIn’s top engineer, had had 
enough. The system was taxed as 
LinkedIn attracted more users, and 
engineers were burnt out. 
“To fix the problems, Scott, who’d 
arrived from Google that February, 
launched Operation InVersion. 
“He froze development on new 
features so engineers could overhaul 
the computing architecture. 
“‘We had to tell management we’re 
not going to deliver anything new 
while all of engineering works on this 
project for the next two months,’ 
Scott says. ‘It was a scary thing.’” 
Source: Pingdom
Why Do I Think This Is 
Important?
The Downward 
Spiral…
Opportunity Cost Of 
Wasted IT Spending? 
$2,600,000,000,000.00 per year 
($2.6 Trillion US)
Our Mission 
Positively influence the 
lives of one million IT 
professionals by 2017.
DevOps Enterprise: Lessons Learned 
• On Oct 21-23, we held the DevOps Enterprise Summit, a conference for horses, by horses 
• Macy’s, Disney, GE Capital, Blackboard, Telstra, US Department of Homeland Security, CSG, Raytheon, Ticketmaster, Union Bank of California 
• Leaders driving DevOps transformations talked about 
    • The business problem they set out to solve 
    • The obstacles they had to overcome 
    • The business value they created
Want To Learn More? 
To receive the following: 
• A copy of this presentation 
• A free 140 page excerpt of The Phoenix Project 
• Information on the DevOps Enterprise: Lessons Learned 
• My recommended reading list for enterprise DevOps adoption 
• Early drafts of our upcoming DevOps Cookbook 
Just pick up your phone, and send an email: 
To: realgenekim@SendYourSlides.com 
Subject: lisa
Can Large Orgs Be High Performers? 
Yes. 
But orgs with 10,000+ employees are 40% less likely 
to be high performing vs. 500 employee orgs… 
Source: Puppet Labs 2014 State Of DevOps
Other Side Of Innovation 

Why Everyone Needs DevOps Now: 15 Year Study Of High Performing Technology Orgs


Editor's Notes

  • #2 My name is Gene Kim. My area of passion started when I was the CTO and founder of Tripwire in 1999. I started keeping a list that we called “Gene’s list of people with great kung fu.” These were the organizations that simultaneously… In the next 25 minutes, I’m really excited to share with you some of my key learnings, which I’m hoping will not only be applicable to you, but that you’ll be able to put into practice right away, and get some amazing results. But let me tell you how my journey began…
  • #4 [ picture of messy data center ] Ten minutes into Bill’s first day on the job, he has to deal with a payroll run failure. Tomorrow is payday, and finance just found out that while all the salaried employees are going to get paid, none of the hourly factory employees will. All their records from the factory timekeeping systems were zeroed out. Was it a SAN failure? A database failure? An application failure? Interface failure? Cabling error?
  • #6 Source: http://biobreak.wordpress.com/2010/10/07/games-evangelism-dos-and-donts/
  • #8 Who are they auditing? IT operations. I love IT operations. Why? Because when the developers screw up, the only people who can save the day are the IT operations people. Memory leak? No problem, we’ll do hourly reboots until you figure that out. Who here is from IT operations? Bad day: Not as prepared for the audit as they thought. Spending 30% of their time scrambling, generating presentations for auditors. Or an outage, and the developer is adamant that they didn’t make the change – they’re saying, “it must be the security guys – they’re always causing outages”. Or, there’s 50 systems behind the load balancer, and six systems are acting funny – what’s different, and who made them different? Or every server is like a snowflake, each having its own personality. We as Tripwire practitioners can help them make sure changes are made visible, authorized, deployed completely and accurately, and find differences. Create and enforce a culture of change management and causality.
  • #9 EG Parts Unlimited, Inc. DBA Parts Unlimited is in serious trouble. Stock has tumbled 19% in the last 30 days, and is down 52% from its peak three years ago. The company continues to be outmaneuvered by their arch-rival, famous for their ability to anticipate and instantly react to customer needs. Parts Unlimited now trails the competition in sales growth, inventory turns and profitability. Parts Unlimited has been promising the release of software called “Phoenix” which – if they can ever get it released – should close the gap. It tightly integrates its retailing and e-commerce channels. Already years late, many expect the company to announce another program delay in their analyst earnings call next month. 20 million in, years late and the Board and the Investors are – let’s just say the natives are restless and are looking for heads. Which means not only have some of the players been let go, and moved positions, but the board is looking at outsourcing and / or splitting up the company. The board has given the team six months to make dramatic improvements.
  • #10 Source: Flickr: birdsandanchors
  • #11 Who’s introducing variance? Well, it’s often these guys. Show me a developer who isn’t causing an outage, I’ll show you one who is on vacation. Primary measurement is deploy features quickly – get to market. I’ve worked with two of the five largest Internet companies (Google, Microsoft, Yahoo, AOL, Amazon), and I now believe that the biggest differentiator to great time to market is great operations. Bad day: We do 6 weeks of testing, but deployment still fails. Why? QA environment doesn’t match production. Or there’s a failure in testing, and no one can agree whether it’s a code failure or an environment failure. Or changes are made in QA, but no one wrote them down, so they didn’t get replicated downstream in production. Believe it or not, we as Tripwire practitioners can even help them – make sure environments are available when we need them, that they’re configured correctly the first time, document all the changes, replicate them downstream.
  • #15 So who are all these constituencies that we can help, and increase our relevance as Tripwire practitioners and champions? How many people here are in infosec? Goal: protect critical systems and data. Safeguard organizational commitments. Prevent security breaches, help quickly detect and recover from them. Bad day: no security standards. No one is complying. Yes, we’re 3 years behind. “Whaddya gonna do about it?” Vs. we (Tripwire owner) can become more relevant and add value by helping infosec leverage all the configuration guidance out there. Measure variance between production and those known good states. Trust and verify that when management says, we’ve trued up the configurations, they’ve actually done it. Why? Now, more than ever, there is an ever increasing number of regulatory and contractual requirements to protect systems and data.
  • #23 There are many ways to react to this: like, fear, horror, trying to become invisible… All understandable, given the circumstances… Because infosec can no longer take 4 weeks to turn around a security review for application code, or take 6 weeks to turn around a firewall change. But, on the other hand, I think it will be the best thing to ever happen to infosec in the past 20 years. We’re calling this Rugged DevOps, because it’s a way for infosec to integrate into the DevOps process, and be welcomed. And not be viewed as the shrill hysterical folks who slow the business down.
  • #31 Tell story of Amazon, Netflix: they care about availability, security. It’s not a push, it’s a pull – they’re looking for our help (#1 concern: fear of disintermediation and being marginalized)
  • #66 Eran Feigenbaum Director of Security, Google Enterprise