JASON HAND |
DevOps Evangelist
• Holds over 15 years of experience as
a developer, system administrator,
and support specialist
• Fully immersed in the world of agile
development and the DevOps
movement with Colorado tech
startups
#DevOpsRoadTrip
A little about VictorOps…
VictorOps is the real-time incident
management platform that combines the
power of people and data to embolden
DevOps pros to handle incidents as they
occur.
#DevOpsRoadTrip
Why Are
We Here?
Culture
“How Organizations Process Information”
Ron Westrum: A Typology of Organizational Cultures
The 2014 State of DevOps Report shows that in the context of IT, job satisfaction is the biggest predictor of
profitability, market share, and productivity. The biggest predictor of job satisfaction, in turn, is how
effectively organizations process information, as determined by a model created by sociologist Ron
Westrum, shown below. 1
1: https://continuousdelivery.com/implementing/culture/
Words are how we think – stories are how we link.
- Christina Baldwin
Oral narrative is and for a long time has been the
chief basis of culture itself.
Stories from the road
Stable Systems
Cynefin
Un-ordered
• Complex: cause and effect only apparent in hindsight (Probe – Sense – Respond)
• Chaotic: cause and effect cannot be related (Act – Sense – Respond)
Ordered
• Complicated: cause and effect requires analysis (Sense – Analyze – Respond)
• Obvious: cause and effect obvious from experience (Sense – Categorize – Respond)
The systems we engineer, maintain, and improve are
Complicated
.. or ..
Known unknowns
The systems we engineer, maintain, and improve are
Complex
Unknown unknowns
What is the
Root
Cause?
What are the..
Contributing
Factors?
Identifying a “root cause” helps us to …
Put it back
how it was
What we really want is to..
Continuously
Improve
Incident Management Maturity
[Chart: Time To Repair (TTR) plotted against Continuous Improvement Efforts across four stages: Reactive (chaotic), Tactical (obvious), Integrated (complicated), Strategic (complex). The characteristics of each stage are listed stage by stage below.]
Reactive
(chaotic)
✓ No automation
✓ No operational stack awareness
✓ Poor collaboration between teams (Dev & Ops)
✓ Documentation not available
✓ No standardized communication
✓ High focus on consistent continuous learning
Tactical
(obvious)
✓ Uses a NOC
✓ Some monitoring & alerting instrumentation
✓ Collaboration in crisis
✓ "Mission critical" processes are available
✓ Understood crisis communication protocols
✓ Remediation data available to IT Operations
Integrated
(complicated)
✓ Team rotations, paging policies, role hunting
✓ Continuous improvement of key health indicators
✓ Technical collaboration across all incidents
✓ Docs up to date and easily accessible
✓ Consistent real-time communication practices
Strategic
(complex)
✓ Automated docs and remediation
✓ Actionable Alerts with full context
✓ High collaboration among all teams
✓ Documentation part of remediation
✓ Targeted, proactive crisis comms
✓ High focus on continuous learning
“Six Trends Shape DevOps Adoption, Q1 2015”
Forrester report
• The Foundation For Success Is In Place . . . Mostly
• Fear Of Failure Will Hamper Advancement
• Monitoring And Analytics Strategies Must Make A Big Leap Forward
• The Focus On Customer Experience Is Not Second Nature . . . Yet
• Change And Release Processes Are Not Delivering Business Needs
• You Must Prioritize And Focus Sourcing Strategies
Automation
Awareness
Collaboration
Documentation
User Empathy
Learning
Learning
Failure not seen as opportunity to learn
Source: “Six Trends Shape DevOps Adoption, Q1 2015”, Forrester report
Awareness
http://blog.vmware.com
© 2015 Forrester Research, Inc. Reproduction Prohibited
Single Source Of Truth Lacking In Many Orgs
– 95% only most of the time or less
Source: April 15, 2015 “Six Trends That Will Shape DevOps Adoption”, Forrester report
Collaboration
Teams siloed throughout life cycle
Source: “Six Trends Shape DevOps Adoption, Q1 2015”, Forrester report
User Empathy
https://open.buffer.com/wp-content/uploads/2015/12/empathy3.jpg
IT teams aren’t measured on customer
experience goals.
Automation
http://thelifedesignproject.com/wp-content/uploads/2009/09/373881476_217d24ef6d.jpg
Delays in notifications lead to customers
finding the problem first
Source: “Six Trends Shape DevOps Adoption, Q1 2015”, Forrester report
Documentation
http://blog.vmware.com
Reduce MTTR
State of DevOps Report (2015)
– by Puppet Labs
Automation
Awareness
Collaboration
Documentation
User Empathy
Learning
jhand.co/DRT_SF
GUILLAUME BINET |
TECH LEAD, SENIOR SOFTWARE ENGINEER
• At Google since 2013, working on the Signals team for
Google Cloud.
• Previously held responsibilities at startups in Europe and the
US, including as API Tech Lead at Twilio
• Author of Errbot (http://errbot.io)
• When not working or hacking on Errbot, you can find him car
racing, scuba diving or tinkering on his latest electronic project
#DevOpsRoadTrip
Chatops ?
Guillaume Binet
2016 VictorOps Devops Roadtrip SFO
Hosted
by
Chat + DevOps
ok, I just pushed the latest touches on branch avengers-v2
!build avengers-v2
Build #423 started...
This is getting exciting! The press release is ready.
Build #423 finished. Triggering tests on image avengers-v2-423.
avengers-v2-423: all tests are green.
w00t! let’s roll!
$ nohup python r2d2.py &
Sprinting
Dependency graph
>!issue deps 1284
1284 - User Story: App skins (Assignee: gbin)
1285 - Feature: Skin dls (Assignee: gbin)
1289 - Feature: Android skins (Assignee: stevo)
1287 - Feature: Skin console (Assignee: gbin)
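A minimal sketch of how a bot command might render such a dependency listing. The issue data comes from the slide; the edges of the graph are not recoverable from the slide text, so the sketch assumes the user story depends directly on the three features. `ISSUES`, `DEPENDS_ON`, and `render_deps` are hypothetical names, not part of any tracker or of Errbot.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Issue(NamedTuple):
    number: int
    kind: str
    title: str
    assignee: str


# Issues as listed on the slide.
ISSUES: Dict[int, Issue] = {
    1284: Issue(1284, "User Story", "App skins", "gbin"),
    1285: Issue(1285, "Feature", "Skin dls", "gbin"),
    1289: Issue(1289, "Feature", "Android skins", "stevo"),
    1287: Issue(1287, "Feature", "Skin console", "gbin"),
}

# ASSUMPTION: the slide text does not show the edges; here the user story
# is assumed to depend directly on each of the three features.
DEPENDS_ON: Dict[int, List[int]] = {1284: [1285, 1289, 1287]}


def render_deps(root: int, depth: int = 0,
                seen: Optional[Set[int]] = None) -> List[str]:
    """Return one indented line per issue reachable from `root`."""
    seen = set() if seen is None else seen
    if root in seen:  # guard against dependency cycles
        return []
    seen.add(root)
    issue = ISSUES[root]
    indent = "  " * depth
    lines = [f"{indent}{issue.number} - {issue.kind}: {issue.title} "
             f"(assignee: {issue.assignee})"]
    for child in DEPENDS_ON.get(root, []):
        lines.extend(render_deps(child, depth + 1, seen))
    return lines


if __name__ == "__main__":
    print("\n".join(render_deps(1284)))
```

A depth-first walk with a `seen` set keeps the output finite even if two issues accidentally depend on each other.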
Building &
Deploying
Continuous integration
Mars-4.4.123 passed all tests.
Releases
>!qa 4.4.123
Deploying Mars-4.4.123 on qa...
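The `!build` and `!qa` commands shown on these slides follow a simple pattern: a prefixed chat message is parsed into a command name and arguments, then routed to a handler. The sketch below illustrates that pattern in plain Python. It is not Errbot's plugin API (Errbot provides its own), and `COMMANDS`, `command`, and `handle_message` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Optional

# Registry mapping a command name (without the "!" prefix) to its handler.
COMMANDS: Dict[str, Callable[[str], str]] = {}


def command(name: str):
    """Decorator that registers a handler under `name`."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        COMMANDS[name] = func
        return func
    return register


@command("build")
def build(args: str) -> str:
    # A real bot would call the CI system here; this stub only replies.
    return f"Build requested for branch {args}..."


@command("qa")
def qa(args: str) -> str:
    # "Mars" is the product name shown on the slide.
    return f"Deploying Mars-{args} on qa..."


def handle_message(text: str, prefix: str = "!") -> Optional[str]:
    """Return the bot's reply, or None if `text` is ordinary chat."""
    if not text.startswith(prefix):
        return None
    name, _, args = text[len(prefix):].partition(" ")
    handler = COMMANDS.get(name)
    if handler is None:
        return f"Unknown command: {name}"
    return handler(args.strip())


if __name__ == "__main__":
    print(handle_message("!qa 4.4.123"))
```

Keeping handlers in a registry means a new command is a single decorated function, which fits the "easy to start with" goal of the talk.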
Monitoring
Alerts
CRITICAL: On prod, trebuchet-3
is at 80% CPU.
End to end test
Test call failed 3 times in a
row on prod.
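An alert such as "failed 3 times in a row" implies a small piece of state: a streak counter that resets on success and fires once when the threshold is reached. A minimal sketch, assuming a threshold of 3 as on the slide; the class name and message format are hypothetical.

```python
from typing import Optional


class ConsecutiveFailureAlert:
    """Raise an alert once a check has failed `threshold` times in a row."""

    def __init__(self, check_name: str, env: str, threshold: int = 3) -> None:
        self.check_name = check_name
        self.env = env
        self.threshold = threshold
        self.failures = 0

    def record(self, passed: bool) -> Optional[str]:
        """Record one result; return an alert message when the threshold is hit."""
        if passed:
            self.failures = 0  # any success resets the streak
            return None
        self.failures += 1
        if self.failures == self.threshold:
            return (f"{self.check_name} failed {self.threshold} times "
                    f"in a row on {self.env}.")
        return None  # below threshold, or already alerted for this streak


if __name__ == "__main__":
    alert = ConsecutiveFailureAlert("Test call", "prod")
    for result in (False, False, False):
        message = alert.record(result)
        if message:
            print(message)
```

Firing only when the streak equals the threshold, rather than on every later failure, avoids flooding the channel while the problem persists.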
Fun !
>!facepalm
>!tourney
>!devops borat
>!ask gbin
[...]
Non stop
breakages
"This thing is
critical!"
O. Dauby, Ops Manager
http://errbot.io
Easy to start with ...
Hot changes
resilient
Chatops on Chatops
Accessible to non-programmers
… but very powerful !
Conversations
Automation
Filter / security
Persistence
Templates
Provisioning
Make it your
own !
http://errbot.io
Thanks !
SEAN FITZGERALD |
SOFTWARE ENGINEER, SNAPCHAT
• Sean Fitzgerald (seanfitz) is a Software Engineer focused on infrastructure at
Snapchat.
• Career ranging from startups as small as 5 to a tour of duty at
Amazon.
#DevOpsRoadTrip
Devops at Snapchat
VictorOps Roadtrip SF, 2016
Ownership
Writing Code
Deploying Code
Monitoring Code
Finding Root Cause
AARON MERRILL |
NOC MANAGER, SONY INTERACTIVE ENTERTAINMENT – PLAYSTATION
• Process leader and evangelist with 20 years' worth of
experience leveraging technical expertise to improve
operations efficiency and reliability, while reducing operational
costs
• Working with Fortune 500 companies including NewsCorp,
Fidelity, Sempra Energy and Sony
• Effective in reducing time-to-resolution of priority incidents by
double-digit percentages through enhanced technologies and
improved process design.
#DevOpsRoadTrip
24X7 Monitoring
Around the clock monitoring
ensuring the availability of
IT services supporting the
advancement of Sony
Interactive Entertainment -
WWS
Service Restoration
Expert outage triage and
high impacting incident
identification leading to
rapid action and resolution
Notification
Timely notification to our
WWS clients and
assembling the appropriate
resources to drive
resolutions as quickly as
possible
Problem Management
Driving the documentation
and root cause analysis on
all high impacting incidents -
ensuring resolution and
proactive prevention of
future events
Maintenance
Providing skilled, flexible,
client focused first-level
game and IT service
maintenance 24 hours a day
Reporting
Consistent accurate
reporting on incident and
performance metrics
Detailed post-mortem
documentation on all high
impacting outages
CHALLENGES
❖ Increasing cost of toolsets
❖ Time-consuming activities that lend themselves to automation
❖ Need for better integration with current monitoring systems
❖ Better context around alerts from disparate systems
❖ Better reporting - MTTA, MTTR, and incident event chronologies
❖ Managing incident notification subscriptions
❖ Faster means of communicating during incidents
SOLUTION
❖ Re-routing incidents to one or more teams
❖ Automated ticket creation
❖ Ability to quiet un-actionable or false alarms
❖ IT communication during incidents (chat, automated conference
calling, Slack integration)
❖ Providing better contextual information (triage docs &
runbooks) along with alerts to quickly solve the incidents
❖ Real-time on-call handoffs and schedule overrides
❖ Continuous incident documentation for post-mortem reporting
and RCA creation
❖ Reporting on MTTA, MTTR, and incident event chronologies
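The MTTA and MTTR reporting named in the list above reduces to averaging time deltas over incident records. A minimal sketch, assuming each incident carries opened, acknowledged, and resolved timestamps, and that MTTR is measured from open to resolution (definitions vary between teams). The `Incident` type and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Incident(NamedTuple):
    opened: datetime
    acknowledged: datetime
    resolved: datetime


def mean_delta(deltas: List[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of time deltas."""
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mtta(incidents: List[Incident]) -> timedelta:
    """Mean time to acknowledge: opened -> acknowledged."""
    return mean_delta([i.acknowledged - i.opened for i in incidents])


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time to resolve (ASSUMPTION: measured opened -> resolved)."""
    return mean_delta([i.resolved - i.opened for i in incidents])


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        Incident(datetime(2016, 5, 1, 10, 0), datetime(2016, 5, 1, 10, 5),
                 datetime(2016, 5, 1, 10, 35)),
        Incident(datetime(2016, 5, 1, 12, 0), datetime(2016, 5, 1, 12, 15),
                 datetime(2016, 5, 1, 13, 25)),
    ]
    print("MTTA:", mtta(sample))
    print("MTTR:", mttr(sample))
```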
SOLUTION - INTEGRATION
SOLUTION
❖ Integrated with VictorOps
❖ Self-service, subscription-based communications (SMS /
Email)
❖ Scheduled maintenance with reminders
❖ Automatically displays the status of infrastructure providers
(Box, Amazon, etc.)
❖ Provides users with incident post-mortem reports
❖ Integrated translation services
SOLUTION
Ongoing improvements
❖ Enhanced real-time communications to our internal clients
❖ Further tool integrations and reporting
❖ Providing enhanced contextual information along with
alerts to quickly solve the incidents at hand
❖ Correlation of change, incidents and problems
❖ Enhanced knowledge articles and runbooks
❖ Translation in Slack channels for real-time communication
Questions
Q&A
BREAK TIME
#DevOpsRoadTrip
Breakout Sessions
◻ ChatOps - Guillaume Binet, Tech Lead/Senior Software Engineer, Google
◻ Security & Compliance in a DevOps World - J. Paul Reed, DevOps Consultant
◻ Finding Signal in the Noise – Aneel Lakhani, SignalFX
◻ DevOps Culture and Burnout – Ken Mugrage, Tech Evangelist, Thoughtworks
◻ Straddling Deployment and Operations – Aaron Merrill, NOC Manager
◻ DevOps Unlocked – Social Contracts and Code – Sean Fitzgerald, Software Engineer at
Snapchat
#DevOpsRoadTrip
KEN MUGRAGE |
TECHNICAL EVANGELIST, THOUGHTWORKS
• 25 years of experience in IT
• Focused on Continuous Delivery and DevOps for much of the
past decade.
• Worked with organizations all over the world, ranging from
startups to Fortune 500 companies.
• Passionate about helping others get better at building, testing
and deploying software and using technology to increase
business effectiveness, as opposed to using the “latest cool
thing”
#DevOpsRoadTrip
ANEEL LAKHANI |
SignalFX
• In technology full-time since high school – from startups to
consulting to teaching to big tech companies to analyst-ing and
back to startups.
• Former research director at Gartner
#DevOpsRoadTrip
J. PAUL REED |
DEVOPS CONSULTANT
• Over a decade of experience in the trenches as a build/release and tools
engineer, working with such organizations as VMware, Mozilla, and
Symantec.
• In 2012, he founded Release Engineering Approaches, a consultancy
incorporating a host of tools and techniques to help organizations “Simply
Ship. Every time.”
• Worked with organizations across a number of industries, from financial
services to cloud-based infrastructure, with teams from 2 to 200.
• Paul is also a founding host of The Ship Show, a twice-monthly podcast
tackling topics related to build engineering, DevOps, and release
management.
#DevOpsRoadTrip
Q&A
How did you
Score?
How Organizations Process Information
Ron Westrum: A Typology of Organizational Cultures
The 2014 State of DevOps Report shows that in the context of IT, job satisfaction is the biggest predictor of
profitability, market share, and productivity. The biggest predictor of job satisfaction, in turn, is how
effectively organizations process information, as determined by a model created by sociologist Ron
Westrum, shown below. 1
1: https://continuousdelivery.com/implementing/culture/
Incident Management Maturity
[Chart: Time To Repair (TTR) plotted against Continuous Improvement Efforts]
Reactive (0 – 4)
(chaotic)
✓ No automation
✓ No operational stack awareness
✓ Poor collaboration between teams (Dev & Ops)
✓ Documentation not available
✓ No standardized communication
✓ High focus on consistent continuous learning
Tactical (5 – 9)
(obvious)
✓ Uses a NOC
✓ Some monitoring & alerting instrumentation
✓ Collaboration in crisis
✓ "Mission critical" processes are available
✓ Understood crisis communication protocols
✓ Remediation data available to IT Operations
Integrated (10 – 14)
(complicated)
✓ Team rotations, paging policies, role hunting
✓ Continuous improvement of key health indicators
✓ Technical collaboration across all incidents
✓ Docs up to date and easily accessible
✓ Consistent real-time communication practices
Strategic (15 – 18)
(complex)
✓ Automated docs and remediation
✓ Actionable Alerts with full context
✓ High collaboration among all teams
✓ Documentation part of remediation
✓ Targeted, proactive crisis comms
✓ High focus on continuous learning
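The score bands above (0 to 4, 5 to 9, 10 to 14, 15 to 18) can be expressed as a small lookup. This is a sketch only: the deck does not say how the score itself is computed, so the function takes a finished integer score, and `STAGES` and `maturity_stage` are hypothetical names.

```python
from typing import List, Tuple

# (lowest score, highest score, stage name), taken from the slide.
STAGES: List[Tuple[int, int, str]] = [
    (0, 4, "Reactive (chaotic)"),
    (5, 9, "Tactical (obvious)"),
    (10, 14, "Integrated (complicated)"),
    (15, 18, "Strategic (complex)"),
]


def maturity_stage(score: int) -> str:
    """Map an incident-management maturity score to its stage."""
    for low, high, name in STAGES:
        if low <= score <= high:
            return name
    raise ValueError(f"score must be between 0 and 18, got {score}")


if __name__ == "__main__":
    for s in (3, 7, 12, 18):
        print(s, "->", maturity_stage(s))
```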
RAFFLE TIME
#DevOpsRoadTrip
Join us for Happy Hour
Co-Sponsored by
• Your nametag is your ticket
• Down the elevator, turn towards Spear Street, and you’ll see it
on your right.
• One Market Street Restaurant
#DevOpsRoadTrip
DENVER - SEATTLE - SAN FRANCISCO - MINNEAPOLIS - NEW YORK CITY

DevOpsRoadTrip San Francisco Final Speaking Deck
