
DOES14 - Dominica DeGrandis - How we used Kanban in Operations to Get Things Done

3,771 views

Dominica DeGrandis, Kanban for DevOps Trainer at DevOps Enterprise Summit 2014

Video: https://www.youtube.com/watch?v=coRx-onQ09Y

Published in: Leadership & Management

  1. How we used Kanban in Operations to get things done @dominicad www.ddegrandis.com
  2. 40 Ops Engineers (SysAdmin, DBA, Network, Mon, Sec) tasked to build out/retrofit 6 data centers across 6 different countries. AND… • keep the lights on in 4 existing data centers • build out a new platform architecture • support live issues (on-call) • roll out a new configuration management tool • deploy new features • deal with 3 org structure changes over a 6-month period
  3. What randomizes your day? • Conflicting priorities • Changing priorities • Interruptions • Context switching • Unable to meet commitments
  4. Sources of customer dissatisfaction: • No visibility into status of request • Things take too long
  5. 3 questions: 1. What is the actual demand? 2. What is the lead time/cycle time? 3. Where are the constraints in the pipeline?
  6. How we got started on the road to improving: we looked at 3 data points. #1 Can we keep up with the demand? [Chart: open vs. closed tickets]
  7. Top 5 reasons why people take on more work than they have capacity to do: 1. Hard to say no to people we like 2. Don’t want to let the team down 3. Didn’t realize work would take so long 4. Fear of those in positions of power 5. People pleaser. Source: John Townsend, "Boundaries": women do more people-pleasing in relationships; men are more likely to say yes to tasks.
  8. #2 Lead time: how long does it take to get work done? 21% of work took > 120 days. [Histogram: number of tickets by number of days it took for a ticket to go from created to closed] The Flaw of Averages
  9. #3 Where is work stuck? [Figure: project work and maintenance work across the states On Deck, Implement, Validate, Closed, with prep, doing, and done sub-labels]
  10. Customer meeting time. Invited customers to a meeting and showed them the data: • showed them the demand and what was getting done • showed them the lead time • showed where work got stuck. Customers appreciated the visibility into Ops. We took advantage of that by humbly asking for their help, beginning with all of those tickets sitting in the validate state.
  11. Next steps: the introduction of a work-in-progress (WIP) limit. Some of these guys had 20 to 40 tickets in their queue. We asked them, “Does this seem reasonable?” How about 10? Let’s head in that direction and see what happens.
  12. And then there were reorgs. The 1st org structure change created the A Team, to focus on completing projects close to being done but still hanging on. • This team didn’t have to respond to one-off requests and wasn’t supposed to be on-call. The 2nd org structure change split Ops into 3 teams (Live Ops, Build, Architecture). • Live Ops had 25% of the team and 60% of the work!
  13. Live Ops tasks • access requests for systems, non-Zabbix monitor • hardware investigation/verification/fixes: vlan/port changes, data retrieval (e.g. logs, network stats) • configuration triage: firewalls, load balancers, OS settings • capacity expansion • verification of configs/services across shards • database development consultation • security compliance mitigation
  14. Physical board reflected tickets in electronic tool
  15. Clear distinction between prioritized work, and capacity to handle the work yet.
  16. Physical board reflected tickets in electronic tool. Clear definitions of done between queues.
  17. Live Ops SRE changes • Socialized the WIP limit idea over 6 months and gradually lowered it from 10 to 7; out of 18 guys, the average is 5-7. • Closed out all tickets with no activity > 90 days • Started saying “No” to last-minute requests. • Hired 2 new people
  18. Hi D.C., Team SRE has a very large number of changes scheduled for today already, and an even larger number of requests in our backlog that this request will displace if moved to the front of the queue. It would not be fair to other teams if we jumped on this immediately while planned work is pushed off. Monitoring should be a requirement for a service to go live, not a last-minute addition. For us to fully support a live service, please implement monitoring before going live. For future requests, please give us as much notice as possible, and make sure to create a ticket (xxx.com) so we can prioritize and schedule the changes as necessary. Here's the ticket for this work…. Respectfully, A.H.
  19. Live Ops SRE changes, cont'd • Took time during standups to focus on kaizen improvements. • Reduced validate state from 7 to 5 to 3 days. • Found creative ways to deal with walk-ups and work done via personal relationships • 15-minute daily sync-up at 3pm instead of interrupting.
  20. “Asking this much of people, even when they wanted to give it, was not acceptable.” - Ed Catmull
  21. Here’s what we need help with: For the leaders: consider the power you have over other people when you ask something of them.
  22. Here’s what we need help with: For the workers: how to make it OK to believe that “No” is an honorable reply to someone asking too much from you.
  23. Improve collaboratively using models
  24. Workflow Optimization using Kanban www.ddegrandis.com dominica@ddegrandis.com @dominicad
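
The three data points from slides 6, 8, and 9, plus the WIP and stale-ticket checks from slide 17, can be sketched in a few lines of Python. This is only an illustration and not the tooling used in the talk: the ticket tuples, the names, the dates, and the function names are all made up, and a real ticketing system would supply these fields through its own export or API.

```python
from collections import Counter
from datetime import date

# Hypothetical ticket records: (assignee, created, closed or None, last_activity).
TICKETS = [
    ("ann", date(2014, 1, 6), date(2014, 1, 9), date(2014, 1, 9)),
    ("ann", date(2014, 1, 20), date(2014, 6, 30), date(2014, 6, 30)),
    ("bob", date(2014, 2, 3), None, date(2014, 2, 10)),
    ("bob", date(2014, 5, 12), None, date(2014, 6, 25)),
    ("bob", date(2014, 3, 1), date(2014, 3, 15), date(2014, 3, 15)),
]
TODAY = date(2014, 7, 1)  # assumed reporting date


def demand(tickets, start, end):
    """Data point #1: tickets opened vs. closed within [start, end]."""
    opened = sum(1 for _, created, _, _ in tickets if start <= created <= end)
    closed = sum(1 for _, _, done, _ in tickets
                 if done is not None and start <= done <= end)
    return opened, closed


def lead_times(tickets):
    """Data point #2: days from created to closed, closed tickets only."""
    return [(done - created).days
            for _, created, done, _ in tickets if done is not None]


def share_over(days, threshold):
    """Fraction of lead times strictly longer than `threshold` days."""
    if not days:
        return 0.0
    return sum(1 for d in days if d > threshold) / len(days)


def wip_by_person(tickets):
    """Open tickets per assignee, to compare against a WIP limit."""
    return Counter(who for who, _, done, _ in tickets if done is None)


def stale(tickets, today, max_idle_days=90):
    """Open tickets with no activity for more than `max_idle_days`."""
    return [t for t in tickets
            if t[2] is None and (today - t[3]).days > max_idle_days]


if __name__ == "__main__":
    print("opened, closed:", demand(TICKETS, date(2014, 1, 1), date(2014, 6, 30)))
    days = lead_times(TICKETS)
    print("lead times (days):", days)
    print("share over 120 days:", share_over(days, 120))
    print("WIP per person:", dict(wip_by_person(TICKETS)))
    print("stale open tickets:", len(stale(TICKETS, TODAY)))
```

Reporting the share of tickets above a threshold, rather than a single average, matches the "Flaw of Averages" point on slide 8: a few very slow tickets are exactly what customers notice.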
