A NON-TECHNICAL TALK
ABOUT IT CULTURE IN
DEVOPS
WHO I AM
• Name: Jon Hildebrand
• Contact: @snoopj123 on Twitter
• Location: Kansas City, Missouri, USA
• Day Job: Formerly a Cloud Architect for a US
Midwest-based managed service provider
• Free Time: Trying to find a baseball game to
watch, raising a four-year-old, blogging, and being
a student of DevOps
• Accolades include: VMware vExpert (Five years),
Cisco Champion (Four years), NetApp United (Two
years), recurring Tech Field Day delegate
WHAT WENT SO WRONG WITH
IT AND CULTURE?
• Ever hear of Friedrich Nietzsche?
• “The Birth of Tragedy from the Spirit of Music”
• The dichotomy of Apollonian and Dionysian thought with regard
to ancient philosophers
• Apollonian thought being rational and reasoned
• Dionysian thought being more irrational and chaotic
• Over time, in IT, we’ve applied ever more rationality
and reason to our processes
• Unfortunately, we forget about the chaotic nature of the
system we try to control
THE GOLDEN TRIANGLE
• The constant struggle for IT
organizations worldwide is effectively
managing this triangle
• Pay too much attention to one of the
areas and the other two will suffer
• Too often, struggling or failing
IT organizations are too focused
on the technology angle
THE TRUTH OF THE MATTER,
BROUGHT TO YOU BY OPENSTACK SUMMIT 2016
<insert tech name>
CAMS
• The four main pillars of DevOps
• Culture
• Automation
• Measurement
• Sharing
• Has been known to include L to
represent Lean software
development practices (CALMS)
THE PHOENIX PROJECT AND THE
THREE WAYS
• The First Way – Systems Thinking
• The Second Way – Amplify Feedback Loops
• The Third Way – Create a culture of continuous
learning and experimentation
CULTURE IN DEVOPS!
• Often overlooked by those trying to make the initial transition
• Focus more on the business goals rather than on technology fiefdoms
• Requires many behavior changes, down to an individual level
• Without changes to behaviors, a culture may never be properly established, dooming the DevOps
movement to failure before even getting to the technology
“You can’t directly change culture. But you can change behavior, and behavior becomes culture.”
- Lloyd Taylor, VP Infrastructure, Ngmoco
ORGANIZATIONAL CONTROLS
Organizations tend to put methods of controls over various aspects of what we call our day-to-day work.
Most organizations will fall under three categories:
• Pathological
• Bureaucratic
• Generative
YOUR CURRENT SITUATION
                  | Pathological Organizations   | Bureaucratic Organizations              | Generative Organizations
Information Handling | Information is hidden        | Information may be ignored              | Information is actively sought
Messenger Handling   | “Shoot the messenger”        | Messengers are tolerated                | Messengers are trained
Responsibilities     | Shirking of responsibilities | Compartmentalized responsibilities      | Responsibilities are shared
Team Bridging        | Highly discouraged           | Allowed, but generally discouraged      | Rewarded
Failure Handling     | Failure is covered up        | Treated with either mercy or punishment | Causes inquiry
New Idea Handling    | New ideas are crushed        | New ideas create problems               | New ideas welcomed
Reflect on your own career…
I’ll bet that many have experienced the first two types of organizations…
Source: Dr. Ron Westrum, “A typology of organisational cultures”, BMJ Quality and Safety 13
TO ERR IS HUMAN
• We all make mistakes...
• However, most of the time, our
mistakes in our day-to-day work
can likely be traced back not to
the last action that led to failure,
but to the system that led to the
action
“Human error is not our cause for troubles; instead, human error is a
consequence of the design of the tools that we gave them.” - Dr. Sidney Dekker
THE BLAME
CULTURE EPIDEMIC
• One of the worst tools an
organization can use when handling
failure is resorting to “Name, Blame,
and Shame”
• Human nature is to blame
immediately; however, blame is the least
productive reaction in a crisis
• Unfortunately, chronic use of this as a
practice usually leads to Fear
“Our work is almost always performed within a complex system and how management
chooses to react to failures and accidents leads to a culture of fear, which then makes it
unlikely that problems and failure signals are ever reported. The result is that problems
remain hidden until a catastrophe occurs.” – Excerpt from “The DevOps Handbook”
“When response to incidents and accidents are seen as unjust, it can impede safety
investigations, promoting fear rather than mindfulness in people who do safety-critical
work, making organizations more bureaucratic rather than more careful, and
cultivating professional secrecy, evasion, and self-protection.” – Excerpt from “Just
Culture: Restoring Trust and Accountability in Your Organization”, by Dr. Sidney Dekker
WHY FAILURE IS A GOOD THING
“By removing blame, you remove fear; by removing fear, you enable honesty; and honesty enables prevention.” –
Bethany Macri, Engineer @ Etsy
• DevOps thrives on experimenting; experimentation means failures WILL occur
• Failures are used to make better processes
• This leads to the concept of organizational learning
• Organizational learning is a process of creating, retaining, and transferring information within an organization
• Doing so leads to many personal gains that also greatly benefit the organization
• You become more self-diagnosing and self-improving
• You become more skilled at detecting and solving problems
DEVOPS PRACTICES TO CREATE A LEARNING-BASED CULTURE
• Blameless post-mortems
• Honest question: how many of you are actually doing post-mortems after events?
• The idea here is to get to the inquiry about the failure, rather than cast the “name, blame, shame” net over
whoever did the work in question
• Also a great opportunity to gather useful information for the entire organization and publish it for all to see
• These releases tend to help in reestablishing trust and accountability with customers impacted by said event
• Controlled Introduction of Failures
• Creates plenty of opportunity for practice/mastery (a major core DevOps principle)
“Coping, fire fighting, and making do were gradually replaced through the organization by a dynamic of
identifying opportunities for process and product improvement. As those opportunities were identified and
the problems were investigated, the pockets of ignorance that they reflected were converted into nuggets
of knowledge.” – Dr. Steven Spear
AMAZON S3 “OUTAGE” – A POST-MORTEM EXAMPLE
• Interesting usage of terms
• Amazon “team member” only mentioned a single time
• The term “process” was used four times
• Half of the post-mortem was devoted to upcoming changes to
operational procedures
• A quarter of the document was devoted to describing
the growing pains of S3, its high-level resiliency
design, and how its recovery took longer than
expected
• The last line of this document states: “We will do
everything we can to learn from this event and use it
to improve our availability even further.”
BRITISH AIRWAYS – PROVIDING AN EXAMPLE OF WHAT NOT TO DO
• On May 27th, 2017, British Airways suffered a major systems
outage resulting in many grounded flights during a holiday
weekend
• The outage is estimated to have cost BA nearly US$112 million
• BA seems intent on placing the blame for the outage on a
single engineer
• That single engineer, while authorized to be in the position
they were, is being reported as “not authorized to do what he
did.”
• BA’s handling of the blame has drawn criticism, given the
open question of how a single engineer could cause this much
of a problem with a single event
• While some of the blame may belong to the
engineer, BA is not faultless here
• Backup systems failed miserably. This is a design and testing
failure of a major system.
DEATH TO THE HERO
• Everyone loves the hero...
• You can easily identify the “hero” of the
organization by finding one of the most burned-out
people
• They generally work extremely long hours, are
usually single-handedly troubleshooting issues,
and fire fighting to keep services up and
running.
• Generally doing this solo
• No shared burden (a core DevOps principle is
shared responsibility)
• Promotes a ton of personal gain at the expense
of team effectiveness
“A culture that rewards firefighting breeds arsonists.”
THE TEAM
• It can’t be stressed enough that DevOps
culture also promotes many intra-team and
inter-team interactions
• Collaboration and affinity are big keys to
highly successful DevOps cultures
• Affinity – Process of building inter-team
relationships, navigating differing goals and
metrics while keeping in mind shared
organizational goals and fostering empathy
and learning between different groups of
people
• Do not forget that you still have to be able
to work with your own team first before
working with others
“That’s a team, gentlemen, and either we heal,
now, as a team, or we will die as individuals.” –
Tony D’Amato, Any Given Sunday
KEY TAKEAWAYS
• Before you start with DevOps and technology, realize that an entire organization needs to
change its culture, all the way down to how team members interact
• Without a culture change, most DevOps initiatives are doomed to immediate failure
• While culture change needs to be an organizational imperative, the individuals are ultimately
responsible for the behavioral changes needed for cultural success
• Also, just because DevOps instills this sort of culture doesn’t mean non-development IT
organizations can’t either. Many of these points are rather universal across most practices
RECOMMENDED
PUBLICATIONS
• Highly recommend “The Goal” as a
lead-in to “The Phoenix Project”. You’ll
recognize many points from “The Goal”
in “The Phoenix Project”
• Also, it goes without saying, since I’m
following Gene Kim, that “The DevOps
Handbook” is another great read about
things leading up to the technical side of
DevOps
• Lastly, I really enjoyed “Effective DevOps”
as it had plenty of focus on the
managerial side of DevOps teams. You
can find plenty more information
about topics not discussed here, and deeper
discussion of silos, effective
communication, and developing team
trust
THANK YOU!

VMUG UserCon Presentation for 2018
