Without Self-Service Operations 
the Cloud Becomes 
Expensive Hosting 2.0 
Damon Edwards @damonedwards
@damonedwards
devopscafe.org
Operations 
Tools 
DevOps Consulting 
Automation Design
Conventional Cloud Wisdom
✓ Saves you time
…provision infrastructure really, really fast
✓ Saves you money
…use only what you need, when needed, then shut off
✓ Saves your job
…or gets you a new one (#linkedin)
We need to move quicker 
than our competitors 
Cloud
Why aren’t we moving quicker 
than our competitors? 
Cloud 
But… 
Server images from x days to z minutes 
y% improved utilization 
t% cheaper storage 
… more ops numbers 
… more ops numbers
No difference
No difference
No difference
PMO Dev QA
Ops
Legacy Process and Tooling
Cloud
No difference
Planning 
Dev Sprints 
Integration 
Infrastructure Procurement and Setup 
Performance Testing 
Security Review 
Prod Release 
Dev Env Setup
Team 1 
Team 2 
Team 3 
Have we improved our ability to give 
the customer... 
• What they want 
• When they want it 
• At the lowest cost possible
People 
Process 
Tools
Keep focused on the metrics 
Lead Times (shorter and more predictable)
MTTD (Mean Time To Detect) 
MTTR (Mean Time to Repair) 
Quality at the Source (Less scrap, caught faster)
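A minimal sketch of how these metrics might be computed from raw timestamps. The `Incident` record, its field names, and the choice to measure MTTR from detection (rather than from the start of the fault) are assumptions for illustration; the slides do not specify them. Predictability of lead time is approximated here by its standard deviation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, pstdev


@dataclass
class Incident:
    """One production incident (hypothetical record; field names are illustrative)."""
    started: datetime   # when the fault began
    detected: datetime  # when it was noticed
    repaired: datetime  # when service was restored


def mean_time_to_detect(incidents):
    """MTTD: average gap between a fault starting and being detected."""
    gaps = [(i.detected - i.started).total_seconds() for i in incidents]
    return timedelta(seconds=mean(gaps))


def mean_time_to_repair(incidents):
    """MTTR: average gap between detection and restored service (assumed definition)."""
    gaps = [(i.repaired - i.detected).total_seconds() for i in incidents]
    return timedelta(seconds=mean(gaps))


def lead_time_stats(changes):
    """Return (mean, spread) of lead time in days for (requested, delivered) pairs.

    The spread is the population standard deviation: the smaller it is,
    the more predictable delivery is.
    """
    days = [(done - asked).total_seconds() / 86400 for asked, done in changes]
    return mean(days), pstdev(days)
```

Tracking the spread alongside the mean is one way to make "more predictable" measurable: two teams with the same average lead time can differ widely in how reliably they hit it.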
Silos are the #1 enemy of throughput and quality
Dev & Test Activity
Release Activity
Ops Activity
Business Activity
Handoff
Handoff
Handoff
! Application Knowledge
! Operational Knowledge
! Business Intent
Ownership but limited Accountability
Accountability but no Ownership
Redraw the organization to eliminate silos
Cross Functional Delivery Team
Dev & Test Activity
Release Activity
Ops Activity
Business Activity
Cross Functional Delivery Team
Cross Functional Delivery Team
Aligned by value streams or customer identifiable services
Freedom & Responsibility Culture is key to enabling
Do this if nothing else!
Google:
“Cloud Operations at Netflix”
“Actionable Metrics Netflix”
Roy Rapoport “DOES14 Netflix youtube”
Different Talk
What do you need to do? 
(Hint: Just doing a re-org won’t work) 
DevOps
Turn information flow into artifact flow
Current state value stream map (lanes: Product Management, Engineering, Cloud Services)

Process step                   L/T    P/T    H/C    S/R    Owner
Customer communication         28d    7d     1      -      Stephen / Xi
Product Program Planning       105d   46d    15     100%   John Robert
Release Program Management     -      -      -      -      Erica Smith
Engineering Planning Process   45d    18d    23     -      Bob Smith
Preliminary Development        45d    21d    140    -      Bob Smith
Full Development               75d    43d    130    -      Bob Smith
Build                          1d     0.3d   2      33%    John Doe
Selective Promotion            90d    15d    5      -      Steve Young
QA Test                        105d   11d    42     -      Sam Young
Engineering Release            60d    1d     1      >5%    Victoria Doe
Release Promotion              60d    0.2d   1      >5%    Victoria Doe
Cloud Services Release         60d    16d    3      3%     Reggie / Carlos
Change Control                 42d    -      -      -      Peter Lee
Deploy Release                 90d    8d     8      2%     Lewis S./Peter Y.
Server Provisioning            24d    4d     3      50%    Jen Garza
Server Acceptance              14d    1d     4.5    15%    Lynn A. etc
Service pack review            56d    7d     6      100%   Suresh Wu

Legend: L/T Lead time, P/T Process time, H/C Head count, S/R Scrap rate
Waste markers on the flows: D Defects, EP Extra processes, M Motion, PD Partially done, TS Task switching, W Waiting
Information artifacts on the map: Shared Drive, Commits, Rollout Schedule, Release Schedule, README, MOP, SOP, PRD, BRD, ERR, BTS, Release Memos, Tasks, QA Forum Ticket, Remedy Ticket, Estimates, Patch Calendar, Design Specs, crit bugs email, Lockdown control checklist, New Targets, Single Image Server, XML, Documentum, Production Packages, derived reqs.
Other labels: Customer, Technical Support, Test, Prod, QA Environment
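One way to read a map like this is to compare process time with lead time for each step. The sketch below does that for four steps, using the L/T and P/T values shown on the map. The term "activity ratio" (P/T divided by L/T) and the `Step` class are assumptions brought in for illustration; the slides do not use them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    lead_time_days: float     # L/T: elapsed time, including waiting
    process_time_days: float  # P/T: time spent actually working

    @property
    def activity_ratio(self):
        """Share of the lead time spent on real work (P/T divided by L/T)."""
        return self.process_time_days / self.lead_time_days


# L/T and P/T values copied from four boxes on the map.
STEPS = [
    Step("Build", 1, 0.3),
    Step("QA Test", 105, 11),
    Step("Release Promotion", 60, 0.2),
    Step("Server Provisioning", 24, 4),
]


def rank_by_waiting(steps):
    """Return steps sorted by waiting time (L/T minus P/T), worst first."""
    return sorted(
        steps,
        key=lambda s: s.lead_time_days - s.process_time_days,
        reverse=True,
    )


for step in rank_by_waiting(STEPS):
    waiting = step.lead_time_days - step.process_time_days
    print(f"{step.name}: {waiting:.1f}d waiting, "
          f"activity ratio {step.activity_ratio:.1%}")
```

Ranking by waiting time rather than by process time points at the handoffs and queues, which is where the map suggests most of the elapsed time goes.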
Insert verification points to drive feedback loops 
Business Need: 
Customer Capacity AZ Redundancy
Site Build 
Value Demand 
Product Manager 
-Partnerships 
-Biz Modeling 
-Hardware 
Acquisition 
6 months 
"Acme Partner" 
Capacity 
-Pick Site (Colo, 
Acme Partner) 
-Physical security 
with InfoSec, 
Contractors 
Procure Hardware 
-Ship to Uranus 
-Cabling and 
power plugs 
1.5 - 2 months 
BOM 
NetArch 
ComputeArch 
StorageArch 
Security 
email 
different parties 
Customizations Ad-hoc 
Redesign 
-5 chassis to 4 
chassis to save $$ 
-No firewall 
Ops Admin 
Rack Elevations 
Patch Diagrams 
Physical Layouts 
1 week 
correct 
cables? 
"must be "latest & 
greatest" gear!" 
FooCorp (Contractor) 
Box.net 
+ Ops Admin 
Rack and Stack 
Elevations 
.vs 
Patch 
Plan 
.xls 
-Rack/Cable 
-Network / ? 
-Labeling 
Build 
spec? 
Build Lead 
Net Ops Admin 
-Net connections 
-VLANs 
-Subnets 
-IP Addressing 
<<Scriptable>> 
3w 2d 
Generic 
Low Level 
Design .xls 
Ops Admin 
Interprets 
Compute and Storage 
Build Team 
-Hardware Manager 
Config/Profile 
-Prepare OS Install 
-Power failover 
testing 
<<Scriptable>> 
2-3d 
Ops Admin Compute 
and Storage Build 
2w 
-Setup OpenStack, 
CEPH 
-Prepare Build Server 
with software repository 
(ubuntu) 
-Cobbler node, PXE 
booting 
-Puppet Master 
-Reprovisioning 
2d 
via cell 
net 
Files and 
Packages 
Ubuntu OS 
Openstack 
.ISV .IMG 
rsync 
scp 
Puppet 
Software Environment 
Server Build 
Code 
COI 
Accumulation 
Delay 
Manifests 
Hiera
Cobbler yaml 
Retrofit 
Jenkins 
SSD? 
Customer? 
Git 
Dev, Ops 
Adding 
Changes 
Acme Partner 
Seeding Env. + Services 
-Reqs 
-Deliver 
Schedule 
-Network 
-Zoning 
-Mail Server 
-Prime Service Catalog 
-6 VN 
-Access 
-Flavors 
-Tenants 
-Authorization 
-Repo Mirror 
-ID Management System (5d) 
-LDAP server 
-2 Factor Auth / Radint 
-Licenses (SQL Server) 
-Images 
-Load Balancer Setup 
-Monitoring 
PM (Skipper) 
Ops 
Devs 
A&E 
-Install HA for each 
product 
-Web portal 
-Databases 
-A.D. 
-Prime Catalog 
3 weeks 
Platform Validation 
Ops Admin 
-Functionality testing 
-Boot VMs 
-Check network 
-Capacity testing 
-Synthetic testing 
Tempest test cases 
Register Support 
Contracts for CEPH, 
Canonical, TAC 
Punchlist 
.xls 
Box.net 
PM (Lewis) 
Site specific 
services and ACLs 
Weekly 
Status 
Meeting 
Acme Partner PM 
-Handoff Meeting to 
Acme Partner 
-Retrospective 
Ad-hoc fix 
push 
Scrum 
Fixed in 
sprint 
Rally 
OpenStack 
Bug 
Dev, Ops 
QA (SDU) 
Systems 
Dev Unit 
-Testing (destructive, 
non-destructive) 
-Restore DB 
3-4 weeks 
Hiera 
Environment Data 
Defects 
InfoSec 
-IDS Infra 
-Install scanner 
-Vulnerability scan 
-3rd party PenTest 
Account 
Manager 
(Bob) 
4 months 
Physical Environment 
Server Environment 
Verification 
Cloud Environment 
Normalize 
Standardize 
Verification Point 
Verification Point 
Verification Point 
Verification Point 
Verification Point 
Verification Point 
Verification Point 
Embed in the rest of the process!! 
• Outside-in perspective on testing (“is this thing working?”) 
• No work output is complete without a verification test (code, not docs!) 
• Start with simple shell scripts (“lingua franca” of ops)
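A verification test in this outside-in sense can be very small. The sketch below is one hypothetical example, assuming the work output is "a service is listening on a TCP port". The function name `is_listening` and its parameters are illustrative and do not come from the talk. The slide suggests starting with shell scripts, where the same check is only a line or two; Python is used here purely as an illustration.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Outside-in check: can a client actually open a TCP connection?"""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A build step would call something like `is_listening("10.0.0.5", 443)` (a made-up address) right after the work is declared done, and treat `False` as "not complete".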
Versioned Release 
Code 
Tests 
Dev Ops * 
Source 
Repo 
Config Env 
Spec 
Run-book Automation 
CI 
Server 
Package 
Repo 
SERVICE 
Operations 
Console 
Shell 
Powershell 
Pre-Production 
Environments 
Shell 
Powershell 
Production 
Environment 
Packages 
Operations 
Development 
SOURCE 
Collaboration 
Dev Ops * 
Execute 
Operations 
Procedures 
Drive all changes through a SDLC 
Same People!!
What about cross-cutting concerns? 
Cross Functional Delivery Team 
(PO • Dev • Test • SRE) 
Tests Code 
Source 
Repo 
Config Env 
Spec 
Run-book Automation 
CI 
Server 
Package 
Repo 
Operations 
Console 
Shell 
Powershell 
Pre-Production 
Packages Environments 
SOURCE 
pull pull pull pull 
Monitoring 
QA Security Environments 
--- Metrics 
QA as a 
Service 
Security as a 
Service 
Metrics as a 
Service 
Env. as a 
Service
Be an internal service provider 
pull 
Cross-Cutting 
Concern X 
✓ Standardized offerings 
✓ Pulled by users (not pushed) 
✓ On-demand and self-service 
✓ Implementation knowledge not 
necessary for normal use 
✓ Provider spends their time building 
service and coaching users 
X as a Service
Start working like an internal service provider 
pull 
X as a Service 
Cross-Cutting 
Concern X 
1 Define your offerings 
2 Tame the tool sprawl 
3 Set up self-service interfaces 
4 Set up secure access 
Plug: Give Rundeck a try --> rundeck.org
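The four steps can be pictured as a small catalog that users pull from. This is a minimal sketch under stated assumptions, not how Rundeck or any other tool implements it: `Offering`, `ServiceCatalog`, the role names, and the offering names are all hypothetical. It shows standardized offerings (step 1), a single interface in front of whatever tools do the work (steps 2 and 3), and an access check on every request (step 4).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Offering:
    name: str
    allowed_roles: frozenset[str]
    run: Callable[[dict], str]  # provider's automation, hidden from the user


class ServiceCatalog:
    def __init__(self) -> None:
        self._offerings: dict[str, Offering] = {}

    def define(self, offering: Offering) -> None:
        """Step 1: publish a standardized offering."""
        self._offerings[offering.name] = offering

    def list_for(self, role: str) -> list[str]:
        """Step 3: users discover what they may pull, without a ticket."""
        return sorted(
            o.name for o in self._offerings.values() if role in o.allowed_roles
        )

    def request(self, role: str, name: str, params: dict) -> str:
        """Steps 3 and 4: self-service execution behind an access check."""
        offering = self._offerings.get(name)
        if offering is None:
            raise KeyError(f"not a standardized offering: {name}")
        if role not in offering.allowed_roles:
            raise PermissionError(f"role {role!r} may not request {name!r}")
        return offering.run(params)
```

The user only needs the offering name and its parameters; the implementation behind `run` can change without the user noticing.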
What about things that 
can’t be automated? 
DevOps
Good rule of thumb: 
Tickets are for exceptions, not the daily work 
Manual request queues lead to... 
• Bottlenecks 
• Increased lead times 
• Reinforced organizational silos 
• Misinterpretation or omissions 
X 
X 
Ticket 
System 
?? 
X
How do we mitigate the negative 
impact of manual request queues? 
DevOps
Use a work management system like Kanban 
[Kanban board. Columns: Backlog (prioritized by stakeholders), Up Next, Doing (Plan it, Do it, Review it, Post Mortem). Swimlanes: Service A, Service B, Service C, Service D, Service E (your standardized offerings), Emergency - Type 1, Emergency - Type 2. Task cards sit in each lane. Callout: SLA per service type.] 
Only works if you set and enforce: 
• Service catalog and backlog rules 
• WIP and SLA per service type 
• WIP per person 
Enforce WIP to protect capacity and hit commitments!
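Enforcing WIP limits can be as simple as refusing to move a card into Doing when a limit is already reached. The sketch below is a toy model, assuming one limit per service type and one shared limit per person; the class `KanbanBoard` and its field names are hypothetical and not tied to any Kanban tool. Services that are not in the catalog get a limit of zero, so off-catalog work never starts.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class KanbanBoard:
    """Toy board that refuses to start work beyond the agreed WIP limits."""
    wip_per_service: dict[str, int]  # e.g. {"Service A": 2}
    wip_per_person: int = 1
    # Each entry in Doing is (task, service, person).
    doing: list[tuple[str, str, str]] = field(default_factory=list)

    def can_start(self, service: str, person: str) -> bool:
        by_service = Counter(s for _, s, _ in self.doing)
        by_person = Counter(p for _, _, p in self.doing)
        return (
            by_service[service] < self.wip_per_service.get(service, 0)
            and by_person[person] < self.wip_per_person
        )

    def start(self, task: str, service: str, person: str) -> bool:
        """Move a task into Doing only if both limits allow it."""
        if not self.can_start(service, person):
            return False  # the task stays in Up Next
        self.doing.append((task, service, person))
        return True

    def finish(self, task: str) -> None:
        """Remove a finished task, which frees capacity for the next pull."""
        self.doing = [t for t in self.doing if t[0] != task]
```

Capacity is freed only by finishing work, which is what protects the team's commitments.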
Unlimited Environments* 
• Hardware is cheap… people and 
opportunity costs are expensive 
• Shared integration environments 
become choke points 
• The more environments people have, 
the more experiments they run 
• The more production-like 
environments people have, the higher 
the quality of the organization's work 
(* OK, yes… nothing is ever unlimited)
...But Security! ...But Compliance! 
DevOps
Security and Compliance Opportunity 
Tests Code 
Source 
Repo 
Config Env 
Spec 
Run-book Automation 
Design and Code 
Reviews 
CI 
Server 
Package 
Repo 
Operations 
Console 
Shell 
Powershell 
Pre-Production 
Environments 
Shell 
Powershell 
Production 
Environment 
Packages 
Operations 
Development 
SOURCE 
Code and Binary 
Scanning 
“Bake” security 
tests into your 
“immune system” 
Component 
vulnerability and 
governance 
Access policy and 
operational security 
checks
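One way to "bake" a security test into the pipeline is to express a governance rule as an ordinary test that runs on every build. The sketch below assumes a hypothetical rule, "no dependency may be on the denylist"; the function `banned_components`, the component names, and the versions are invented for illustration. A real pipeline would load the manifest from the package repo and the denylist from whatever vulnerability feed the security team maintains.

```python
def banned_components(
    manifest: dict[str, str], denylist: dict[str, set[str]]
) -> list[str]:
    """Return 'name==version' for every dependency that is on the denylist."""
    return sorted(
        f"{name}=={version}"
        for name, version in manifest.items()
        if version in denylist.get(name, set())
    )


def test_no_banned_components() -> None:
    # Hypothetical data standing in for the real manifest and denylist.
    manifest = {"libfoo": "1.2.0", "libbar": "3.1.4"}
    denylist = {"libfoo": {"1.0.0", "1.1.0"}}
    assert banned_components(manifest, denylist) == []
```

Because the rule lives in the test suite, a violation fails the build in the same way a functional regression would.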
Security and Compliance Opportunity 
How did you validate 
the change? 
Tests Code 
Source 
Repo 
Config Env 
Spec 
Run-book Automation 
CI 
Server 
Package 
Repo 
Operations 
Console 
What was 
executed on the 
box to make the 
change? 
Shell 
Powershell 
Pre-Production 
Environments 
Shell 
Powershell 
Production 
Environment 
Packages 
Operations 
Development 
SOURCE 
What’s the 
change? 
Where did the 
change go? 
Who has access to 
what environment? 
Who did what when 
and where? 
Change things 
here 
Run / control 
things here
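When every execution goes through one operations console, the audit questions on this slide become queries over a log. The following is a minimal sketch, assuming each execution is recorded with the user, the command, the environment, and the release it came from; `AuditRecord`, `AuditLog`, and the method names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    who: str
    what: str       # the command or job that was executed
    where: str      # target environment
    when: datetime
    release: str    # versioned release the change came from


class AuditLog:
    def __init__(self) -> None:
        self._records: list[AuditRecord] = []

    def record(self, who: str, what: str, where: str, release: str) -> AuditRecord:
        rec = AuditRecord(who, what, where, datetime.now(timezone.utc), release)
        self._records.append(rec)
        return rec

    def executed_in(self, where: str) -> list[AuditRecord]:
        """Who did what, and when, in one environment (oldest first)."""
        return [r for r in self._records if r.where == where]

    def where_did_it_go(self, release: str) -> list[str]:
        """Environments a release reached, in the order it reached them."""
        seen: list[str] = []
        for r in self._records:
            if r.release == release and r.where not in seen:
                seen.append(r.where)
        return seen
```

The log is only trustworthy if changes are made in source and executed through the console, which is the split the slide draws between "change things here" and "run / control things here".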
Recap 
• Redraw the org to eliminate silos! 
• Turn information flow into artifact flow 
• Insert verification points to tighten 
feedback loops 
• Drive all changes through a SDLC 
• Turn cross-cutting concerns into internal 
service providers 
• Strive for unlimited environments
Bonus: DevOps Litmus Test 
Reduce cycle time AND improve quality? 
Eliminate handoffs or reduce the friction of 
those handoffs that can't be eliminated? 
Eliminate manual information flow and replace 
with tool-to-tool artifact flow? 
Eliminate manually-fulfilled request queues and 
other sources of waiting and context 
switching? 
Improve awareness and understanding of how 
work is flowing across the end-to-end lifecycle? 
1 or more marked “NO”? Then back to the drawing board!
@damonedwards 
damon@dtosolutions.com

Without Self-Service Operations, the Cloud is Just Expensive Hosting 2.0 - (a DevOps story)

  • 45.
    Insert verification points to drive feedback loops [Diagram: the same site-build value stream map with the Physical Environment, Server Environment, Verification, and Cloud Environment labels, now with two callouts marked Verification Point.]
  • 46.
    Insert verification points to drive feedback loops [Diagram: the same site-build value stream map, now with seven callouts marked Verification Point.] Embed in the rest of the process!!
  • 48.
    Insert verification points to drive feedback loops [Diagram: the same site-build value stream map with seven Verification Point callouts.] Embed in the rest of the process!! • Outside-in perspective on testing (“is this thing working?”)
  • 49.
    Insert verification points to drive feedback loops [Diagram: the same site-build value stream map with seven Verification Point callouts.] Embed in the rest of the process!! • Outside-in perspective on testing (“is this thing working?”) • No work output complete without a verification test (code not docs!)
  • 50.
    Insert verification points to drive feedback loops [Diagram: the same site-build value stream map with seven Verification Point callouts.] Embed in the rest of the process!! • Outside-in perspective on testing (“is this thing working?”) • No work output complete without a verification test (code not docs!) • Start with simple shell scripts (“lingua franca” of ops)
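One way to read the bullets above is that each verification point becomes a small executable check whose pass/fail result decides whether a work output counts as complete. The slides suggest starting with shell scripts; the sketch below uses Python only for illustration. The function names, the check names, and the stand-in probes are all assumptions and do not come from the talk.

```python
import socket
from typing import Callable, Dict


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Outside-in probe: can a client actually open a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_verification_point(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every named check; a check that raises counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def verification_passed(results: Dict[str, bool]) -> bool:
    """Complete only if at least one check ran and every check passed."""
    return bool(results) and all(results.values())


# Hypothetical checks; the lambdas are stand-ins for real probes such as
# lambda: tcp_port_open("build-server.example", 22)
results = run_verification_point({
    "build server answers": lambda: True,
    "vm boots": lambda: False,
})
for name, ok in results.items():
    print("PASS" if ok else "FAIL", name)
print("verified" if verification_passed(results) else "not verified")
```

Treating an empty set of checks as "not verified" mirrors the "no work output complete without a verification test" bullet: a step with no test cannot pass by default.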
  • 51.
    Drive all changes through a SDLC Tests Code Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE
  • 52.
    Drive all changes through a SDLC Code Dev Ops * Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE Collaboration Tests
  • 53.
    Drive all changes through a SDLC Versioned Release Code Tests Dev Ops * Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE Collaboration
  • 54.
    Drive all changes through a SDLC Versioned Release Code Tests Dev Ops * Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE Collaboration Dev Ops * Execute Operations Procedures
  • 55.
    Drive all changes through a SDLC Versioned Release Code Tests Dev Ops * Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE Collaboration Dev Ops * Execute Operations Procedures Same People!!
  • 56.
    Drive all changes through a SDLC Versioned Release Code Tests Dev Ops * Source Repo Config Env Spec Run-book Automation CI Server Package Repo SERVICE Operations Console Shell PowerShell Pre-Production Environments Shell PowerShell Production Environment Packages Operations Development SOURCE Collaboration Dev Ops * Execute Operations Procedures Same People!!
  • 57.
    What about cross-cutting concerns? Cross Functional Delivery Team (PO • Dev • Test • SRE) Tests Code Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Packages Environments SOURCE Monitoring QA Security Environments --- Metrics
  • 58.
    What about cross-cutting concerns? Cross Functional Delivery Team (PO • Dev • Test • SRE) Tests Code Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Packages Environments SOURCE Monitoring QA Security Environments --- Metrics QA as a Service Security as a Service Metrics as a Service Env. as a Service
  • 59.
    What about cross-cutting concerns? Cross Functional Delivery Team (PO • Dev • Test • SRE) Tests Code Source Repo Config Env Spec Run-book Automation CI Server Package Repo Operations Console Shell PowerShell Pre-Production Packages Environments SOURCE pull pull pull pull Monitoring QA Security Environments --- Metrics QA as a Service Security as a Service Metrics as a Service Env. as a Service
  • 60.
    Be an internal service provider pull Cross-Cutting Concern X ✓ Standardized offerings ✓ Pulled by users (not pushed) ✓ On-demand and self-service ✓ Implementation knowledge not necessary for normal use ✓ Provider spends their time building service and coaching users X as a Service
  • 61.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X
  • 62.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X 1. Define your offerings
  • 64.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X 1. Define your offerings 2. Tame the tool sprawl
  • 66.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X 1. Define your offerings 2. Tame the tool sprawl 3. Set up self-service interfaces
  • 68.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X 1. Define your offerings 2. Tame the tool sprawl 3. Set up self-service interfaces 4. Set up secure access
  • 70.
    Start working like an internal service provider pull X as a Service Cross-Cutting Concern X 1. Define your offerings 2. Tame the tool sprawl 3. Set up self-service interfaces 4. Set up secure access Plug: Give Rundeck a try --> rundeck.org
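The four steps above can be sketched as a tiny in-process service catalog: a fixed list of offerings, one wrapper per underlying tool, a single request entry point, and a role check before anything runs. This is an illustration only and is not Rundeck's API; the class names, the role names, and the "restart-service" offering are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Offering:
    """One standardized offering in the catalog (step 1)."""
    name: str
    # Wraps whatever tool does the work, so users need not know it (step 2).
    run: Callable[[Dict[str, str]], str]
    # Roles allowed to pull this offering (step 4).
    allowed_roles: Set[str] = field(default_factory=set)


class ServiceCatalog:
    """Single self-service entry point that users pull from (step 3)."""

    def __init__(self) -> None:
        self._offerings: Dict[str, Offering] = {}

    def register(self, offering: Offering) -> None:
        self._offerings[offering.name] = offering

    def request(self, user_roles: Set[str], name: str, params: Dict[str, str]) -> str:
        offering = self._offerings.get(name)
        if offering is None:
            raise KeyError(f"not a standardized offering: {name}")
        if not (user_roles & offering.allowed_roles):
            raise PermissionError(f"no role in {sorted(user_roles)} may run {name}")
        return offering.run(params)


# Hypothetical offering; the lambda stands in for a real automation job.
catalog = ServiceCatalog()
catalog.register(Offering(
    name="restart-service",
    run=lambda p: f"restarted {p['service']} in {p['env']}",
    allowed_roles={"dev", "ops"},
))
print(catalog.request({"dev"}, "restart-service", {"service": "web", "env": "qa"}))
```

Requests for anything outside the catalog fail loudly rather than falling back to an ad-hoc path, which is what keeps the offerings standardized.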
  • 71.
    What about things that can’t be automated? DevOps
  • 72.
    Good rule of thumb: Tickets are for exceptions, not the daily work X X Ticket System ?? X
  • 73.
    Good rule of thumb: Tickets are for exceptions, not the daily work Manual request queues lead to... • Bottlenecks • Increased lead times • Reinforced organizational silos • Misinterpretation or omissions X X Ticket System ?? X
  • 74.
    How do we mitigate the negative impact of manual request queues? DevOps
  • 75.
    Use a work management system like Kanban [Kanban board. Labels: Backlog prioritized by stakeholders, Up Next, Doing, Plan it, Do it, Review it, Post Mortem; Service A through Service E; Emergency - Type 1, Emergency - Type 2; many Task cards.]
  • 76.
    Use a work management system like Kanban [Same Kanban board.] Only works if you set and enforce: • Service catalog and backlog rules • WIP and SLA per service type • WIP per person
  • 77.
    Use a work management system like Kanban [Same Kanban board, with the callout "Your standardized offerings".] Only works if you set and enforce: • Service catalog and backlog rules • WIP and SLA per service type • WIP per person
  • 78.
    Use a work management system like Kanban [Same Kanban board, with the callouts "Your standardized offerings" and "SLA per service type".] Only works if you set and enforce: • Service catalog and backlog rules • WIP and SLA per service type • WIP per person
  • 79.
    Use a work management system like Kanban [Same Kanban board, with the callouts "Your standardized offerings" and "SLA per service type".] Only works if you set and enforce: • Service catalog and backlog rules • WIP and SLA per service type • WIP per person Enforce WIP to protect capacity and hit commitments!
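The enforcement rules above (work only from the service catalog, a WIP limit per service type, a WIP limit per person) can be sketched as a check that runs before a task moves into Doing. The class names, limits, and service names below are hypothetical, and SLA tracking is left out to keep the sketch short.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    title: str
    service_type: str
    owner: str


class Board:
    def __init__(self, wip_per_service: Dict[str, int], wip_per_person: int) -> None:
        # Service types missing from this mapping are not in the catalog.
        self.wip_per_service = wip_per_service
        self.wip_per_person = wip_per_person
        self.doing: List[Task] = []

    def can_start(self, task: Task) -> bool:
        by_service = Counter(t.service_type for t in self.doing)
        by_owner = Counter(t.owner for t in self.doing)
        # Unknown service types get a limit of 0, so they can never start.
        service_limit = self.wip_per_service.get(task.service_type, 0)
        return (by_service[task.service_type] < service_limit
                and by_owner[task.owner] < self.wip_per_person)

    def start(self, task: Task) -> None:
        if not self.can_start(task):
            raise RuntimeError(f"WIP limit reached, cannot start: {task.title}")
        self.doing.append(task)

    def finish(self, task: Task) -> None:
        self.doing.remove(task)


# Hypothetical limits: two concurrent "Service A" tasks, one task per person.
board = Board({"Service A": 2, "Emergency - Type 1": 1}, wip_per_person=1)
first = Task("provision QA env", "Service A", "ann")
board.start(first)
print(board.can_start(Task("second task for ann", "Service A", "ann")))
```

Refusing to start work when a limit is hit is the point of the last callout: the limit protects capacity only if the board says no.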
  • 80.
    Unlimited Environments* (*OK, yes… nothing is ever unlimited)
  • 81.
    Unlimited Environments* • Hardware is cheap… people and opportunity costs are expensive (* OK, yes… nothing is ever unlimited)
  • 82.
    Unlimited Environments* • Hardware is cheap… people and opportunity costs are expensive • Shared integration environments become choke points (* OK, yes… nothing is ever unlimited)
  • 83.
    Unlimited Environments* • Hardware is cheap… people and opportunity costs are expensive • Shared integration environments become choke points • The more environments people have, the more experiments they run (* OK, yes… nothing is ever unlimited)
  • 84.
    Unlimited Environments* • Hardware is cheap… people and opportunity costs are expensive • Shared integration environments become choke points • The more environments people have, the more experiments they run • The more production-similar environments people have, the higher the quality of the organization (* OK, yes… nothing is ever unlimited)
  • 85.
    ...But Security! ...But Compliance! DevOps
  • 86.
    Security and Compliance Opportunity [Diagram: the SDLC pipeline from the "Drive all changes through a SDLC" slides. Labels: Tests, Code, Source Repo, Config, Env Spec, Run-book Automation, CI Server, Package Repo, Operations Console, Shell, PowerShell, Pre-Production Environments, Production Environment, Packages, Operations, Development, SOURCE.]
  • 87.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] Design and Code Reviews
  • 88.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] Design and Code Reviews Code and Binary Scanning
  • 89.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] Design and Code Reviews Code and Binary Scanning “Bake” security tests into your “immune system”
  • 90.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] Design and Code Reviews Code and Binary Scanning “Bake” security tests into your “immune system” Component vulnerability and governance
  • 91.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] Design and Code Reviews Code and Binary Scanning “Bake” security tests into your “immune system” Component vulnerability and governance Access policy and operational security checks
  • 92.
    Security and Compliance Opportunity [Same SDLC pipeline diagram, without callouts.]
  • 93.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change?
  • 94.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change? How did you validate the change?
  • 95.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change? How did you validate the change? Where did the change go?
  • 96.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change? How did you validate the change? Where did the change go? Who has access to what environment? Who did what when and where?
  • 97.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change? How did you validate the change? Where did the change go? Who has access to what environment? Who did what when and where? What was executed on the box to make the change?
  • 98.
    Security and Compliance Opportunity [Same SDLC pipeline diagram.] What’s the change? How did you validate the change? Where did the change go? Who has access to what environment? Who did what when and where? What was executed on the box to make the change? Change things here Run / control things here
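When every change runs through one pipeline and one operations console, the audit questions on these slides can be answered from records written at execution time. A minimal sketch follows, assuming a hypothetical in-memory log; the field names, the "webapp-1.4.2" package, and the "ci-build-118" identifier are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditRecord:
    who: str           # who ran it
    what: str          # the versioned change, e.g. package name and version
    command: str       # what was executed on the box
    where: str         # target environment
    validated_by: str  # test run or CI build that validated the change
    when: str          # UTC timestamp, ISO 8601


class AuditLog:
    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def record(self, who: str, what: str, command: str,
               where: str, validated_by: str) -> AuditRecord:
        rec = AuditRecord(who, what, command, where, validated_by,
                          when=datetime.now(timezone.utc).isoformat())
        self._records.append(rec)
        return rec

    def by_environment(self, where: str) -> List[AuditRecord]:
        """Answers 'where did the change go?' for one environment."""
        return [r for r in self._records if r.where == where]

    def to_json_lines(self) -> str:
        """One JSON object per line, ready to hand to an auditor or a log store."""
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self._records)


# Hypothetical entry written by the operations console when a job runs.
log = AuditLog()
rec = log.record(who="alice", what="webapp-1.4.2",
                 command="deploy.sh webapp 1.4.2",
                 where="production", validated_by="ci-build-118")
print(log.to_json_lines())
```

The record is written by the tool that executes the change, not by the person afterwards, so the answers do not depend on anyone remembering to fill in a ticket.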
  • 100.
    Recap • Redrawthe org to eliminate silos!
  • 101.
    Recap • Redrawthe org to eliminate silos! • Turn information flow into artifact flow
  • 102.
    Recap • Redrawthe org to eliminate silos! • Turn information flow into artifact flow • Insert verification points to tighten feedback loops
  • 103.
    Recap • Redrawthe org to eliminate silos! • Turn information flow into artifact flow • Insert verification points to tighten feedback loops • Drive all changes through a SDLC
  • 104.
    Recap • Redrawthe org to eliminate silos! • Turn information flow into artifact flow • Insert verification points to tighten feedback loops • Drive all changes through a SDLC • Turn cross-cutting concerns into internal service providers
  • 105.
    Recap • Redraw the org to eliminate silos! • Turn information flow into artifact flow • Insert verification points to tighten feedback loops • Drive all changes through an SDLC • Turn cross-cutting concerns into internal service providers • Strive for unlimited environments
  • 106.
  • 107.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality?
  • 108.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated?
  • 109.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow?
  • 110.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow? Eliminate manually-fulfilled request queues and other sources of waiting and context switching?
  • 111.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow? Eliminate manually-fulfilled request queues and other sources of waiting and context switching? Improve awareness and understanding of how work flows across the end-to-end lifecycle?
  • 112.
    Bonus: DevOps Litmus Test One or more marked “NO”? Then back to the drawing board!
  • 113.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? One or more marked “NO”? Then back to the drawing board!
  • 114.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? One or more marked “NO”? Then back to the drawing board!
  • 115.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow? One or more marked “NO”? Then back to the drawing board!
  • 116.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow? Eliminate manually-fulfilled request queues and other sources of waiting and context switching? One or more marked “NO”? Then back to the drawing board!
  • 117.
    Bonus: DevOps Litmus Test Reduce cycle time AND improve quality? Eliminate handoffs or reduce the friction of those handoffs that can't be eliminated? Eliminate manual information flow and replace with tool-to-tool artifact flow? Eliminate manually-fulfilled request queues and other sources of waiting and context switching? Improve awareness and understanding of how work flows across the end-to-end lifecycle? One or more marked “NO”? Then back to the drawing board!
  • 118.