@RealGeneKim
Gene Kim
genek@realgenekim.me
Leading A DevOps Transformation
Dr. Steven Spear
“As tempting as it seems, you cannot reorganize your way to continuous improvement and adaptiveness. What is decisive is not the form of the organization, but how people act and react.
“The roots of Toyota’s success lie not in its organizational structures, but in developing capability and habits in its people. It surprises many people, in fact, to find that Toyota is largely organized in a traditional, functional-department style.”
– Mike Rother
Dr. Steven Spear
“While designing perfectly safe systems is likely beyond our abilities, safe systems are close to achievable” when the following conditions are met…
Capability 1
 See problems as they occur:
 Complex work is managed so that problems in design are revealed
 They see problems as they occur, through relentless testing of assumptions
Automated testing in the deployment pipeline, proactive monitoring of the production environment, …
Source: Dr. Steven Spear
Capability 2
 Swarming and solving problems as they are seen to build new knowledge
 Problems that are seen are solved so that new knowledge is built quickly
 Improvement of daily work is prioritized above daily work
Stopping work when builds, tests, deployments and services break, enabling fast feedback loops, especially to Dev…
Source: Dr. Steven Spear
Capability 3
 Spreading new knowledge throughout the organization
 The new discovery of local knowledge and improvements are turned into global improvements, shared throughout the organization
 Learning is fed back into the system to prevent future failures
High trust culture, blameless post-mortems when things go wrong, single source code repositories enterprise-wide, …
Source: Dr. Steven Spear
Capability 4
 Leading by developing
 The job of leaders is not to command and control, but to create other capable leaders who can perpetuate this system of work
Encouraging experimentation and learning, coaching, removing obstacles, enabling
Source: Dr. Steven Spear
“Culture isn’t just touchy-feely kumbaya. Instead, it is the consistent response by a group of people to conditions. When we change culture, we fundamentally shift how people respond to a situation.”
– Dr. Steven Spear
“The most effective way is for senior leaders to change the conversation from ‘did you carry your orders out?’ to ‘what did you learn today?’”
– Dr. Steven Spear
The “Big Bang” Transformation Dream
[Diagram: Start -> Finish]
Source: Damon Edwards (@damonedwards)
The “Big Bang” Transformation Reality
[Diagram: Start -> Finish, with labels Fear, Panic, Abort, Maybe]
People revert to legacy behaviors
Source: Damon Edwards (@damonedwards)
“Big J” vs “Little J’s”
[Diagram: two Start -> Finish paths, labeled “Big Bang” and “Continuous Improvement”]
Source: Damon Edwards (@damonedwards)
Other Side Of Innovation
Breaking The Bottlenecks In The Flow
 Environment creation
 Code deployment
 Test setup and run (mention @rohansingh)
 Overly tight architecture
 Development
 Product management
Blackboard Learn: 2005-Present
[Chart labels: LoC, Commits; annotation: “The Problem”]
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
Blackboard Learn Building Blocks
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
Target
“stopping changes makes it worse”
“still working out how to apply this to legacy”
“still challenged to scale across thousands of people”
 Make structural changes
 Modernise technology
 Connect important dots
 Build an internal incubator
 Develop learning service offerings
 Prioritise demand based on constraints
 Six internal DevOps conferences
Instead of waiting 3-6 months, an individual can build a full stack automatically
200 trained in DevOps
Source: Rob England (@theitskeptic)
Chivas Nambiar, Verizon
CSG bill printing
40 dev teams, 1000 staff
A release has been practiced 70 times
Phoenix servers not snowflakes
 Improve work visibility
 Single intake of work: dev, ops, requests
 Go see, and role rotation
 Change behaviour to change culture
 Legacy test automation
 Strangler pattern
 Telemetry and shared understanding
Regression tests went from 20% to 5% of effort
Incidents per release: 201 (2013) -> 18 (2015)
Source: Rob England (@theitskeptic)
CSG International Confidential and Proprietary Information Copyright © 2015 CSG Systems International, Inc. and/or its affiliates (“CSG International”). All rights reserved.
1b. Dependency Visibility
Make your team and system dependencies visible. Leverage this to increase understanding, unwind handoffs and move towards feature teams.
41 Teams
7 Iterations
“Conway’s Board”
http://www.scaledagileframework.com/release-planning
Michael Hrenko, Blue Shield of CA
Nordstrom
“stopped optimizing for cost, started optimizing for speed”
“in 2015, 20% lead time reduction target across the board for customer facing properties”
 Goal: make cycle time visible
 Created internal Kata coaches and trainers to help internal teams
 Experimenting with microservices for e-commerce
Cosmetic Business Office lead time: 7d to “nearly real time”
Source: Rob England (@theitskeptic)
TicketMaster
73 dev teams, 100% push their own code
3 days DevOps training = access to Prod
Metal-to-money deployment, no hands
Ego is a forcefield against learning
Blocked is an unacceptable state
 Breaking bread together
 Breathe customer air
 4 in the box: mgr, ops, UX, process
 Dev teams on call
 Metrics: outcomes over outputs
98% reduction in MTTR
Source: Rob England (@theitskeptic)
USAA military insurance
If you have to rely on heroics, your process is broken
It is requirements and testing that take the time
 Elevator pitch: aspirational, same page
 Have an internal brand
 Daily regression test runs overnight
 Leading indicators on a dashboard
Release 28 days -> 7 days with 40 years of legacy
Source: Rob England (@theitskeptic)
Sherwin-Williams
Never mind the technology: you need a salesman and a politician
 Using SAFe
 Maturity model
 Code
 Environments
 Data
 Tests
 Process
 Use value mapping to find the pain points
46,000 code deploys a year
Provision an Oracle server in 15 minutes
Source: Rob England (@theitskeptic)
ING
TiTo (today in, today out): go home with a clean slate
Agile can learn from ITIL and ITIL can learn from Agile
 Eliminate duplicate admin, make ITIL as lean as possible
 Reserve 30% of sprint capacity for incidents
 Problem management stories as backlog
 Minimise tech debt: ThisSprintInNextSprintOut
 Daily CAB
 Permission to change from other team members
500 app teams doing DevOps
Source: Rob England (@theitskeptic)
Capital One
It’s never going to be perfect; it’s only going to get better.
All new software must justify why not open source.
Dev, QA and some prod on public cloud
 Started with automated builds for one team
 Developers are customers of the toolmakers
 All code peer reviewed before merge to trunk
Building a server: 60 days, $25k -> on demand
Internal DevOps conference: 1200 attendees
Source: Rob England (@theitskeptic)
HP
DevOps is a parallel mode for us.
The war is over: the source control tool is git.
Trust but verify.
Minimum viable process.
 CIO in the room
 ChatOps integration
 Lightweight peer reviews
 Collaboration without playing with org charts
 Vertical
One change = one deploy
Source: Rob England (@theitskeptic)
IBM
JAT tool built on ANT to test CICS
RD&T tool on Intel/Linux
 Mainframe test automation
 Refactoring to callable services
 Recompile optimisation
VISANET has been up for 19 years
Dev time reduced by 90%
Testing from weeks to hours
Source: Rob England (@theitskeptic)
Google
2013: 15k Devs, 4k projects
The biggest obstacle is how we see the world
 One version of code
 Open repository
75 million test cases a day
5,500 code commits a day
CSC
If you ask people to change they don’t go straight to awesome, first they get worse. You have to practice.
 Measure baseline
 Visualise system of work
 Identify waste
 Change the bureaucracy
 Measure improvement
Deployments 12 hours -> 12 minutes
Source: Rob England (@theitskeptic)
