3. Web / Cloud Operations
is the ability to consistently create and
deploy reliable software to an unreliable
platform that scales horizontally.
http://radar.oreilly.com/2007/10/operations-is-a-competitive-ad.html
12. Fundamental Attributes of Successful Cultures
1) Infrastructure as Code
2) Application as Services
3) Dev + Ops + All as Teams
Massive improvement in
“Time to Value”
13. Common Attributes of Successful Cultures
Infrastructure as Code
‣ Full Stack Automation
‣ Commodity Hardware and/or Cloud Infra
‣ Reliability in software stack
‣ Datacenter or Cloud Infrastructure APIs
‣ Core Infra Services
‣ Infrastructure as Product
‣ App as Customer
Application as Services
‣ Service Orientation
‣ Lightweight Protocols
‣ Versioned APIs
‣ Software Resiliency (Design for Failure)
‣ Database/Storage Abstraction
‣ Complexity pushed up the stack
‣ Deep Instrumentation
Dev / Ops / All as Teams
‣ Shared Metrics / Monitoring
‣ Incident Management
‣ Service Owners On-call
‣ Tight integration
‣ Continuous Integration
‣ Continuous Deployment
‣ SRE/SRO
‣ GameDay
15. The path organizations take...
1. Discovery and Visibility
2. Common Automation Tasks: Scripts, OS Compliance, Updates & Patches
3. Configuration Management
4. Application Management
5. Continuous Deployment
6. Full Infrastructure Automation
16. back at the office,
this may sound familiar...
20. Dear Jesse,
I work for a big company. I tried to talk to people about this awesome stuff and they told me it would never work here. What do I do now?
Sincerely,
Most of us
25. Changing Culture:
1. Start small, build trust & safety
2. Create Champions
3. Use metrics to build confidence
4. Celebrate successes
5. Exploit Compelling Events
27. Example:
GameDay
Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
http://www.flickr.com/photos/dnorman/2678090600
28. define:
GameDay
An exercise designed to increase
Resilience through large-scale fault
injection across critical systems.
Part of a larger discipline called
Resilience Engineering.
See also: Chaos Monkey
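To make "fault injection" concrete, here is a minimal sketch of the idea at the scale of a single function call. It is not how any particular GameDay or Chaos Monkey is implemented; `FaultInjector`, `InjectedFault`, `call_with_fallback`, `lookup_profile`, and `cached_profile` are all hypothetical names invented for illustration. The point is only that a failure is triggered on purpose and the caller is expected to degrade gracefully.

```python
import random


class InjectedFault(Exception):
    """Raised in place of a real dependency failure during an exercise."""


class FaultInjector:
    """Wraps a callable and deliberately fails a fraction of its calls."""

    def __init__(self, target, failure_rate, seed=None):
        if not 0.0 <= failure_rate <= 1.0:
            raise ValueError("failure_rate must be between 0 and 1")
        self.target = target
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)
        self.injected = 0  # how many faults have been injected so far

    def __call__(self, *args, **kwargs):
        # random() returns a value in [0, 1), so a rate of 1.0 always fails
        # and a rate of 0.0 never fails.
        if self.rng.random() < self.failure_rate:
            self.injected += 1
            raise InjectedFault("simulated outage")
        return self.target(*args, **kwargs)


def call_with_fallback(primary, fallback, *args):
    """Design-for-failure caller: use the fallback when the primary fails."""
    try:
        return primary(*args)
    except InjectedFault:
        return fallback(*args)


def lookup_profile(user_id):
    # Hypothetical dependency standing in for a critical system.
    return {"id": user_id, "source": "primary"}


def cached_profile(user_id):
    # Hypothetical degraded-mode answer served when the primary is down.
    return {"id": user_id, "source": "cache"}
```

Wrapping `lookup_profile` as `FaultInjector(lookup_profile, failure_rate=1.0)` and passing it to `call_with_fallback` simulates a full outage of that dependency; the interesting question during an exercise is whether the fallback path actually works.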
30. GameDay increases Resilience in 3 ways
Preparation
‣ Identification and mitigation of risks and impact from failure
‣ Reduces frequency of failure (raises MTBF)
‣ Reduces duration of recovery (lowers MTTR)
Participation
‣ Builds confidence & competence in responding to failure under stress.
‣ Strengthens individual and cultural ability to anticipate, mitigate, respond to, and recover from failures of all types.
Exercises
‣ Trigger and expose “latent defects”
‣ Choose when to discover them, instead of letting that be determined by
the next real disaster.
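The MTBF and MTTR figures mentioned under Preparation can be computed from a simple incident log. The sketch below assumes the common definitions (MTBF as uptime divided by number of failures, MTTR as average outage duration, availability as MTBF / (MTBF + MTTR)); the `Incident` record and the sample numbers are made up for illustration and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    start_hour: float  # hours since the observation window opened
    end_hour: float    # hour at which service was restored


def _downtime(incidents):
    return sum(i.end_hour - i.start_hour for i in incidents)


def mttr(incidents):
    """Mean time to recovery: average outage duration, in hours."""
    if not incidents:
        raise ValueError("need at least one incident")
    return _downtime(incidents) / len(incidents)


def mtbf(incidents, window_hours):
    """Mean time between failures: total uptime divided by failure count."""
    if not incidents:
        raise ValueError("need at least one incident")
    return (window_hours - _downtime(incidents)) / len(incidents)


def availability(incidents, window_hours):
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    up = mtbf(incidents, window_hours)
    down = mttr(incidents)
    return up / (up + down)
```

For a 720-hour window with two outages lasting 2 and 4 hours, this gives an MTTR of 3 hours and an MTBF of 357 hours. Fewer failures raise MTBF and faster recovery lowers MTTR, and either change pushes availability up.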
40. Hacks: Creating Champions
1. Get executive sponsors, starting
with your boss.
2. Give everyone else the credit.
3. Give “Special Status”
4. Have people with “Special Status”
talk about the new awesome.
41. Hacks: Metrics
1. Find a KPI that supports the change
2. Track and use it ruthlessly - first
to show value, later to show laggards
the cost of not making the change
3. Tell your story with data
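As one possible illustration of telling the story with data, the sketch below compares a KPI before and after a change. The slides do not name a specific KPI; median lead time from commit to production is an assumption chosen here as a rough proxy for "Time to Value", and the function names and sample figures are hypothetical.

```python
from statistics import median


def median_lead_time_hours(deploys):
    """deploys: iterable of (commit_hour, live_hour) pairs."""
    return median(live - commit for commit, live in deploys)


def compare(before, after):
    """Summarize how the KPI moved between two sets of deploys."""
    b = median_lead_time_hours(before)
    a = median_lead_time_hours(after)
    return {
        "before_hours": b,
        "after_hours": a,
        "improvement_pct": round(100 * (b - a) / b, 1),
    }
```

Calling `compare` with deploys from before the change and deploys from a team that adopted it yields a single number that can show value early on and, later, the cost of staying on the old path.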
42. Hacks: Celebrating successes
1. Tell a powerful story
2. Always be positive about people and how
they overcame a problem.
3. Never make it about the people who
created the problem.
4. Leave room for people to come to
your side. (don’t fight stupid ;-)
43. Hacks: Compelling Events
1. Just wait, it will come
2. Can be created by things like
compliance, scaling, cloud
migrations
3. Not “I told you so” - but “what do
we do now”