The document describes the many challenges that can arise when resolving incidents in production systems. It shows how resolving a single issue requires coordination across various teams including NOC, SREs, developers, middleware, and management. This process involves opening tickets, collecting logs, troubleshooting, waiting for access or approvals, and context switching, all of which can introduce delays and interruptions.
Making Tomorrow Better than Today - Unlocking the Full Potential of Operations Rundeck
Keynote presentation by Damon Edwards, co-founder of Rundeck, at DevOps Days Salt Lake City on May 15, 2019.
See a Demo of Rundeck Enterprise :
https://www.rundeck.com/see-demo
--or--
Download Rundeck Open Source here:
https://rundeck.com/open-source
Connect:
Stack Overflow community: https://stackoverflow.com/questions/tagged/rundeck
Github: https://github.com/rundeck/rundeck/issues
Twitter: https://twitter.com/Rundeck
Facebook: https://www.facebook.com/RundeckInc/
LinkedIn: https://www.linkedin.com/company/rundeck-inc
Tickets Make Operations Work Unnecessarily Miserable Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at Interop ITX 2019 Las Vegas.
Ticket-driven request queues have been the default way of working in operations for a long time and are the cornerstone of most operations management and ITSM strategies. But what if ticket queues are actually the source of much of the dysfunction, bottlenecks, and capacity issues that have traditionally plagued our organizations? This session will examine the dark side of ticket queues, including hidden costs and how they undermine DevOps and SRE transformations. Then we’ll explore alternative strategies high-performing operations organizations use to minimize dependence on ticket queues.
SRE for Everyone: Making Tomorrow Better Than Today Rundeck
Keynote presentation at DevOps Days Austin 2019 by Damon Edwards, co-founder of Rundeck.
Wouldn't everyone doing operations work love more time to focus on exciting projects? Build out new platforms, improve performance, contribute to open source projects, pay down tech debt, level-up their automation — all things that add value to your company and advance your career.
But instead, we find ourselves buried in interruptions and repetitive work. Imagine the things you could do if you just had the time to get to them.
This talk is about ideas from the SRE movement that can be applied to any organization. Ideas that can help us all make tomorrow better than today.
SysAdmin to SRE: Solving the Last Mile Problem Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at DevOps Days Dallas on August 20, 2019.
Some DevOps transformations flourish, but others are stalling. Why is that? This talk will make the case that Operations is the most predictable differentiator.
So much of the energy in DevOps has been about activities that start in Dev and move towards Ops: continuous delivery, deployment pipelines, automated testing, and of course, the unofficial mantra of “deploy, deploy, deploy.” However, post-deployment, too many DevOps transformations maintain the status quo and leave questionable Operations practices in place.
Now along comes a new vision for Operations called SRE (a.k.a. Site Reliability Engineering)… But SRE seems almost too good to be true!
SREs cover much of what systems administrators used to do, but get to spend most of their time doing engineering work that adds enduring value to their company? How is it that SREs don’t get caught up in the interruptions, repetitive work, and drudgery that consume so much of our time? And how do companies use SRE to do so much more with the same or less headcount?
This talk will take a close look at what SRE is, what SRE isn’t, and how SRE avoids the pitfalls that have plagued traditional Ops work. Finally, we’ll break down the principles behind the SRE movement and highlight how early examples are proving that DevOps + SRE = the end-to-end speed and quality promised since the early days of DevOps.
Ops Happens: Improving Incident Response Using DevOps and SRE Practices Rundeck
Damon Edwards, co-founder of Rundeck, presents at Interop ITX in Las Vegas on May 3, 2018.
Clearing the Way For SRE In the Enterprise Rundeck
As presented by Damon Edwards, co-founder of Rundeck, at SREcon in Dusseldorf, Germany on 30 Aug 2018.
Video available here:
https://www.usenix.org/conference/srecon18europe/presentation/edwards
SysAdmin to SRE: Creating Capacity to Make Tomorrow Better Than Today Rundeck
Damon Edwards, co-founder of Rundeck, talks at SCALE 17x on March 9, 2019 in Pasadena, CA.
Wouldn't everyone in operations love more time to work on exciting projects? Build out new platforms, improve performance, contribute to open source projects, focus on security, level up their automation: all things that add value to your company and advance your career. But instead, the life of a traditional systems administrator is often buried in interruptions and repetitive work. Imagine the things you could do if you just had the time to get to them.
Then along comes a new way of working and a new role called Site Reliability Engineering (SRE). But SRE almost seems too good to be true! People are doing what systems administrators used to do, but getting to spend more than 50% of their time doing engineering work that adds enduring value to their company? How can less than half of these SREs' time be wasted on the interruptions, repetitive work, and drudgery that seem to consume most of the traditional systems administrator's time? And do this with the same or less headcount?
This talk will first take a close look at what SRE is and what SRE isn't. We will break down the principles behind the SRE movement and highlight where SRE departs from the current conventional wisdom of Operations and Systems Administration work. You'll learn about key concepts like Toil, SLOs, Error Budgets, and Shared Responsibility Models.
Next, we'll look at how to move to an SRE style of working. We'll look at how traditional operations beliefs and practices can leave organizational scar tissue that is difficult to overcome. We'll examine examples of how silos, excessive toil, reliance on queues, and incorrectly applied governance models undermine the adoption of SRE principles and practices in the enterprise. We'll also look at the individual skills and mindset changes that you'll need to adopt an SRE way of working.
You'll leave this talk with an appreciation for how SRE can create the capacity you need to make tomorrow better than today.
Some DevOps transformations flourish, but many others are stalling. Why is that?
Damon Edwards, co-founder at Rundeck, makes the case that Operations is the difference maker.
As presented at Comcast DevOps Days Denver 2019
Presentation by Damon Edwards, co-founder of Rundeck, at NewOps Days in Raleigh, NC on December 4, 2018.
Modern Operations: Solving DevOps’ Last Mile Problem Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at Nike internal DevOps Day in Beaverton, OR on June 18, 2018.
This talk looks at the forces that fundamentally undermine operations work and what needs to be addressed if enterprises are going to get the most out of their digital transformation and DevOps initiatives.
Incident Management in the Age of DevOps and SRE Rundeck
Damon Edwards, co-founder of Rundeck, presents at Salt Lake City DevOps Meetup, November 13, 2019.
There is no doubt that DevOps has changed how we deliver software. But what about after deployment? Whether you are in a traditional operations organization or a “you build it, you run it” team, how do you mobilize, resolve, and learn from incidents? This talk will look at how high-performing organizations have applied DevOps and SRE practices to shorten incidents and reduce escalations. Less frustration for the engineers. Lower costs for the business. Everybody wins.
Failure Happens: Improving Incident Response In Enterprises Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at USENIX LISA in San Francisco, CA on November 3, 2017.
Operations as a Service: Because Failure Still Happens Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at All Day DevOps on October 24, 2017.
Operations: The Last Mile Problem For DevOps Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at DevOps Enterprise Summit London 2018
View the video here:
https://www.youtube.com/watch?v=dp76E7j0FdQ&t=755s
Incident Management in the Age of DevOps and SRE Rundeck
Keynote presentation at DevOps Con Munich, December 3, 2019, presented by Damon Edwards, co-founder of Rundeck.
Responding to incidents has always been the core job of Operations. With the rise of DevOps and SRE, how Operations work gets done — and who is doing the work — is changing. This talk will look at how high-performing organizations are applying DevOps and SRE practices to shorten incidents and reduce escalations.
Presentation by Damon Edwards, co-founder of Rundeck, at Nexus Conf 2018 on how Security teams can help Operations and, in turn, help themselves.
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at DevOps Enterprise Summit in San Francisco, November 13, 2017
Self-Service Operations: Because Failure Still Happens (Developer Edition) Rundeck
Keynote presentation at DevNet Create 2017 by Damon Edwards, co-founder of Rundeck.
Agile and DevOps have provided plenty of lessons for how to speed up the pace of application delivery and the frequency of application deployment. But delivery and deployment cover only one part of the day-to-day life of developers in large enterprises. What about what happens after deployment? In many enterprises, increasing the pace of delivery and frequency of deployment has just increased the operational support load, work interrupts, and context switching that were already cutting deeply into development teams' time.
This talk will focus on the successful design patterns that high-performing, large-scale organizations have applied to reduce the operational burden and support costs across their entire organization. Specifically, we’ll look at how they apply DevOps principles to improving the post-deployment lifecycle and how Developers play the key role in reducing the difficulty and cost of operations activity for everyone.
Self-Service Operations: Because Ops Still Happens Rundeck
Keynote presentation by Damon Edwards, co-founder of Rundeck, at DevOps Days Austin, May 4, 2017.
Deployment is a solved problem. Sure, there is still work to be done, but the DevOps community has successfully proven that anyone can both scale deployment automation and distribute the capability to execute deployments. Now, we have to turn our attention to the next critical constraint: What happens after deployment?
We all know that failure is inevitable and is coming our way at any moment. How do we respond quickly and effectively to those failures? What works when there is just a small set of teams or an isolated system to manage will quickly break down when the organization grows in size and complexity. But on the other hand, what has been commonly practiced in large-scale enterprises is proving to be too cumbersome, too silo dependent, and simply too slow for today's business needs.
How do we rapidly respond to incidents and recover complex interdependent systems while working within an equally complex and interdependent organization? How do Ops teams embrace the DevOps and Agile inspired demand for speed while maintaining quality and control?
This talk examines the trial-and-error lessons learned by some forward-thinking enterprises who are currently streamlining how they:
- Resolve incidents
- Reduce friction between teams
- Divide up operational responsibilities
- Improve the quality of their ongoing operations (and organizational learning)
How to bootstrap an SRE team into your company: how to hire them, what to have them work on, and how to interact with them as a team. Finally, some thoughts on general practices to consider before your SREs arrive. There are also kitten pictures.
Storage, network and computational resources are becoming API driven. Configuration management tools provide another level of automation and semantics to the systems. As these tools evolve, the exercise of building systems looks more and more like software development. Further, when developing web applications, the application is the infrastructure. If the servers are down, there is no application. The value of the application is tied to the systems. Treating the systems and application holistically, and encouraging communication and collaboration between dev and ops, is the path to true artisanal retro-futurism ⊗ team-scale anarcho-syndicalism.
The prelude to the talks at Velocity and Agile 2009. A few of the same slides and sentiments, but presented in a different way. More mentions of Puppet specifically, for one.
My History with Atlassian Tools, and Why I'm Moving to Studio Atlassian
Ever considered going hosted? Find out why SaaS may make sense for your development teams. This session shares one customer's experiences with Atlassian developer tools, and their thinking around JIRA Studio and on-demand dev suites.
Customer Speaker: Jeff Schnitter of Work Day
Key Takeaways:
* Best practices for efficient dev teams
* Tips and tricks for tuning Atlassian's dev tools suite
* Considerations for SaaS dev suites
Teaching Elephants to Dance (and Fly!) A Developer's Journey to Digital Trans... Burr Sutter
We can be brilliant developers, but we won’t succeed—and won’t lead our organizations to succeed—without a new perspective (if you will) and new assumptions about the components of the “technology ecosystem” that are fundamentally critical to our success. This includes the operators, QA team, DBAs, security folks, and even the pure business contingent—in most cases, each of these individuals and groups plays a critical role in the success of what we create and give birth to as developers. What we do in isolation might be genius, but if we insulate ourselves—especially with arrogance—from these colleagues, neither our code nor our organizations will realize their full potential, and most will fail. The bottom line is that our old ways are no longer viable, and as the elite within our industry, we will be the leaders and heroes who discard old assumptions and adopt a new perspective in this exciting journey to digital transformation—where the impossible can become reality.
On April 10, 2013, Eric Mattison gave a talk on Tastypie: Easy APIs to Make Your Work Easier.
"Have you ever dealt with any of these problems:
- Unwieldy, Scary-to-Change Applications?
- Long Development Cycles?
- Replicated Code?
- Scope Creep?
- Restless Leg Syndrome?
Tastypie can help you solve these problems and more!"
Damon Edwards, co-founder of Rundeck, presentation at NewOps Days in Raleigh, NC on December 4, 2018.
See a Demo of Rundeck Enterprise :
https://www.rundeck.com/see-demo
--or--
Download Rundeck Open Source here:
https://rundeck.com/open-source
Connect:
Stack Overflow community: https://stackoverflow.com/questions/tagged/rundeck
Github: https://github.com/rundeck/rundeck/issues
Twitter: https://twitter.com/Rundeck
Facebook: https://www.facebook.com/RundeckInc/
LinkedIn: www.linkedin.com › company › rundeck-inc
Modern Operations: Solving DevOps’ Last Mile Problem Rundeck
Damon Edwards, co-founder of Rundeck, presentation at Nike internal DevOps Day in Beaverton, OR on June 18, 2018.
This talk looks at the forces that fundamentally undermine operations work and what needs to be addressed if enterprises are going to get the most out of their digital transformation and DevOps initiatives.
Incident Management in the Age of DevOps and SRE Rundeck
Damon Edwards, co-founder of Rundeck, presents at Salt Lake City DevOps Meetup, November 13, 2019.
There is no doubt that DevOps has changed how we deliver software. But what about after deployment? Whether you are in a traditional operations organization or a “you build it, you run it” team, how do you mobilize, resolve, and learn from incidents? This talk will look at how high performing organizations have applied DevOps and SRE practices to shorten incidents and reduce escalations. Less frustration for the engineers. Lower costs for the business. Everybody wins.
Failure Happens: Improving Incident Response In Enterprises Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at USENIX LISA in San Francisco, CA on November 3, 2017.
Operations as a Service: Because Failure Still Happens Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at All Day DevOps on October 24, 2017.
Operations: The Last Mile Problem For DevOps Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at DevOps Enterprise Summit London 2018
View the video here:
https://www.youtube.com/watch?v=dp76E7j0FdQ&t=755s
Incident Management in the Age of DevOps and SRE Rundeck
Keynote presentation at DevOps Con Munich, December 3, 2019, presented by Damon Edwards, co-founder of Rundeck.
Responding to incidents has always been the core job of Operations. With the rise of DevOps and SRE, how Operations work gets done — and who is doing the work — is changing. This talk will look at how high-performing organizations are applying DevOps and SRE practices to shorten incidents and reduce escalations.
Damon Edwards, co-founder of Rundeck, presentation at Nexus Conf 2018 on how Security teams can help Operations and, in turn, help themselves.
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at DevOps Enterprise Summit in San Francisco, November 13, 2017
Self-Service Operations: Because Failure Still Happens (Developer Edition) Rundeck
Keynote presentation at DevNet Create 2017 by Damon Edwards, co-founder of Rundeck.
Agile and DevOps have provided plenty of lessons for how to speed up the pace of application delivery and the frequency of application deployment. But delivery and deployment cover only one part of the day-to-day life of developers in large enterprises. What about what happens after deployment? In many enterprises, increasing the pace of delivery and frequency of deployment has just increased the operational support load, work interrupts, and context switching that were already cutting deeply into development teams' time.
This talk will focus on the successful design patterns that high-performing, large-scale organizations have applied to reduce the operational burden and support costs across their entire organization. Specifically, we’ll look at how they apply DevOps principles to improving the post-deployment lifecycle and how Developers play the key role in reducing the difficulty and cost of operations activity for everyone.
Self-Service Operations: Because Ops Still Happens Rundeck
Keynote Presentation by Damon Edwards, co-founder of Rundeck, at DevOps Days Austin, May 4, 2017.
Deployment is a solved problem. Sure there is still work to be done, but the DevOps community has successfully proven that anyone can both scale deployment automation and distribute the capability to execute deployments. Now, we have to turn our attention to the next critical constraint: What happens after deployment?
We all know that failure is inevitable and is coming our way at any moment. How do we respond quickly and effectively to those failures? What works when there is just a small set of teams or an isolated system to manage will quickly break down when the organization grows in size and complexity. But on the other hand, what has been commonly practiced in large-scale enterprises is proving to be too cumbersome, too silo dependent, and simply too slow for today's business needs.
How do we rapidly respond to incidents and recover complex interdependent systems while working within an equally complex and interdependent organization? How do Ops teams embrace the DevOps and Agile inspired demand for speed while maintaining quality and control?
This talk examines the trial-and-error lessons learned by some forward-thinking enterprises who are currently streamlining how they:
-Resolve incidents
-Reduce friction between teams
-Divide up operational responsibilities
-Improve the quality of their ongoing operations (and organizational learning)
How to bootstrap an SRE team into your company: how to hire them, what to have them work on, and how to interact with them as a team. Finally, some thoughts on general practices to consider before your SREs arrive. There are also kitten pictures.
Storage, network and computational resources are becoming API driven. Configuration management tools provide another level of automation and semantics to the systems. As these tools evolve the exercise of building systems looks more and more like software development. Further, when developing web applications, the application is the infrastructure. If the servers are down, there is no application. The value of the application is tied to the systems. Treating the systems and application holistically, encouraging communication and collaboration between dev and ops is the path to true artisanal retro-futurism ⊗ team-scale anarcho-syndicalism.
The prelude to the talks at Velocity and Agile 2009. A few of the same slides and sentiment, but presented in a different way. More mentions of Puppet specifically for one.
My History with Atlassian Tools, and Why I'm Moving to Studio Atlassian
Ever considered going hosted? Find out why SaaS may make sense for your development teams. This session shares one customer's experiences with Atlassian developer tools, and their thinking around JIRA Studio and on-demand dev suites.
Customer Speaker: Jeff Schnitter of Work Day
Key Takeaways:
* Best practices for efficient dev teams
* Tips and tricks for tuning Atlassian's dev tools suite
* Considerations for SaaS dev suites
"Product Architecture: failures and lessons learnt" - Royi Benyossef @Product...Product of Things
Product architecture is the scheme by which the function of a product is allocated to physical components. The process of building out a software and hardware product while simultaneously conducting market research, receiving customer feedback, and developing the hardware must be informed and strategic. In his session, Royi will discuss the various architectures his team had to develop in order to achieve different, yet optimal, product versions for the Vidmind product. For each product version, Royi covers where they went wrong, elaborates on what the company did to resolve these challenges in the next version, and describes the outcome of each change that was implemented.
Incident Management in the Age of DevOps and SRE Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at QCon San Francisco 2019.
The Ember.js Framework - Everything You Need To Know All Things Open
All Things Open 2014 - Day 2
Thursday, October 23rd, 2014
Yehuda Katz
Founder of Tilde
Front Dev 1
The Ember.js Framework - Everything You Need To Know
Innovation Through DevOps (DevOpsCon Berlin 2015) Wooga
“You build it, you run it!” - If you, as a developer, know that you will have to operate your software yourself, what are you willing to do to make later operations easier?
At Wooga, dozens of teams have looked for their own answer to this question while learning from the experiences of the other teams. The result is a large field of experimentation in operating web services, and a technological innovation that has taken us, within a few iterations, from a simple LAMP stack to stateful servers based on Erlang or Akka that scale with load.
Looking at historic, current and evolving approaches, I will take you through from how we used to 'live' edit on one server with HTML in the code; to implementing Template Toolkit and 'front end / back end' servers; to the addition of version control; all the way through to distributed caching, file systems and processing (aka Six Apart worship) with 15+ servers.
Developing a backend for a globally successful social game is not easy, but the real challenge is operating the systems! At Wooga, the same two to three developers are responsible for both. Over the last two years, half a dozen teams have taken on this challenge. In doing so, they were able to draw on the experiences of the preceding teams and had the freedom to pursue their own approaches. The talk will trace the resulting evolution of the backends: initially LAMP, then Ruby instead of PHP, then NoSQL instead of MySQL, and finally no database at all, built on Erlang OTP. The account of this journey will show clearly what advantages there can be in giving developers free rein once in a while.
Progressive Enhancement for JavaScript Apps Codemotion
"Progressive Enhancement for JavaScript Apps" by Garann Means.
When progressive enhancement was introduced as a concept, JavaScript applications seemed as relevant as flying cars. As JS became more powerful, it seemed we'd reach a point where we could forget PE entirely. For its original meaning, we now have rock-solid libraries and polyfills to provide abstractions that make PE easy. But as JS has advanced, we've started writing things that can't be polyfilled. We know now how to progressively enhance widgets and user interactions. We'll talk about how we progressively enhance entire applications, and why it's more important than ever that we do so.
Scaling Up Lookout was originally presented at Lookout's Scaling for Mobile event on July 25, 2013. R. Tyler Croy is a Senior Software Engineer at Lookout, Inc. Lookout has grown immensely in the last year. We've doubled the size of the company—added more than 80 engineers to the team, support 45+ million users, have over 1000 machines in production, see over 125,000 QPS and more than 2.6 billion requests/month. Our analysts use Hadoop, Hive, and MySQL to interactively manipulate multibillion row tables. With that, there are bound to be some growing pains and lessons learned.
Debugging Production Applications in Nomad using Lightrun ShaiAlmog1
Nomad makes it easy to deploy and orchestrate applications with varying topology, but the underlying applications still come with their own set of challenges that most observability tools still struggle with. Pinpointing the root cause of an issue is something those tools weren’t designed for. Unfortunately, at the scales enabled by Nomad there are no alternatives other than logging. Yes, logs can be a powerful tool. But using a log as your only tool to debug concurrency and scale issues in production environments seems like bringing a stone age solution to a space age problem. To make matters worse, many of these issues just can’t be replicated locally; at those scales things are pretty different.
Continuous observability tools open a hatch into production and let us debug production servers, securely and at scale. In this talk we’ll discuss the Lightrun Nomad integration and how you can use Lightrun to debug and measure production services at the source code level.
Rundeck Community Office Hours: Using Variables with Job Steps Rundeck
Rundeck offers powerful runbook automation. Most Runbooks are complicated multi-step processes. We will show various examples of how to share data from one step to another through the use of Log Filters.
Come join this session to learn how to:
Use different types of Log Filters to gather variables from your Job Steps
Gather variables and use the values in other Job Steps
Use the Result Data feature to format your output in a consistent format regardless of the log output.
Most of what Rundeck does is done via one of its plugins. There are already more than 100 plugins that perform various services, including executing commands on nodes, performing a step in a workflow, or sending notifications about job status. There may be instances where you need to write your own plugin to perform a specific step or action. In this session, we will walk through the steps for writing our own plugin.
In this session you'll learn:
The structure of a plugin
How to use the structure and what information you need to include in other files to make your plugin work
How to write a simple plugin example using Java
How to deploy and use your plugin
Lunch and learn: Getting started with Rundeck & Ansible Rundeck
Operations teams depend on a mixture of tools to keep their systems running. One popular pairing for Rundeck users is integrating Ansible playbooks into Rundeck to orchestrate and schedule workflows across multiple tools.
Join us for this Lunch and Learn event to learn how you can use Rundeck to create runbooks that span your existing Ansible playbooks -- as well as any other scripts, tools, APIs, or systems commands, to respond to incidents or perform Operations tasks.
Join us to learn:
Benefits of using Rundeck and Ansible together
How to configure your Rundeck to use the Ansible plugin
Tips for getting started with the integration
And see a demo of the integration
This event is recommended for beginners.
Self Service Cloud Operations: Safely Delegate the Management of your Cloud ... Rundeck
Running Operations is not an easy job, especially these days. Ops teams have to ensure excellent user experiences, resolve incidents quickly and help developers stay productive. Yet at the same time, there is also the need to maintain systems security and keep downtime to a minimum.
While advances in cloud computing have helped address some of these challenges, many organizations find it difficult to leverage the cloud at scale because of bottlenecks that form around repetitive tasks, such as developers having to wait for provisioning infrastructure. Despite having access to abundant cloud resources, these speedbumps often make it difficult to achieve team objectives.
Join this talk to learn:
How to safely delegate the management of your cloud deployment (to developers and other end users) with self-service operations.
How to create powerful runbooks with guardrails that leverage existing scripting languages, infrastructure, and tools to remove bottlenecks that form around repetitive tasks.
Strategies for getting started with self-service.
Rundeck Office Hours: Best Practices Access Control Policies Rundeck
Join us this month for an AMA discussion followed by a live Q&A led by technical experts from Rundeck’s engineering, product, and solution engineering teams. Experts are available to provide advice on your technical architecture, give recommendations for operational best practices, review current Github issues, or dive into the open source code itself.
Don’t miss the opportunity to learn Rundeck product best practices and ask experts your questions about Rundeck.
https://www.rundeck.com/rundeck-office-hours
Secure IT infrastructure is well protected by access keys, passwords, and other credentials. Admins need these secrets to gain access, as does any automation executed by Rundeck. Rundeck has rich support for secrets management with native key storage, as well as integrations with best-of-breed standardized solutions. In this webinar, we’ll cover best practices for working with Rundeck’s runbook automation platform in securing IT infrastructure. We’ll explore the secrets management options in Rundeck and we’ll highlight a new plugin with Thycotic Secret Server for Privileged Access Management.
In this webinar, we will demonstrate:
How Rundeck works with underlying secrets of the systems it manages
New Rundeck plugins that allow users to protect privileged accounts with enterprise-grade, privileged access management solutions
How you can use Rundeck plugins with HashiCorp Vault, Thycotic, and CyberArk as keys for jobs and other Rundeck configurations
In this session we will give a live walkthrough covering new capabilities released in Rundeck 3.4. Learn about security & compliance improvements we’ve made including the ability to organize secrets management by project -- so now each Runbook can access a different set of passwords and keys for its access control list (ACL). We also have a new plug-in for Thycotic users to manage secrets. Rundeck 3.4 now allows for queueing of jobs when those jobs must be run serially. Finally, we’ll discuss our vision for the future of Rundeck, and our primary development themes for the next year.
Automate Yourself Out of a Job: Safely Delegate the Management of your Azure... Rundeck
Running Operations is not an easy job, especially these days. Ops teams have to ensure excellent user experiences, resolve incidents quickly and help developers stay productive. Yet at the same time, there is also the need to maintain systems security and keep downtime to a minimum - goals which many struggle with at scale.
While advances in cloud computing have helped address some of these challenges, many organizations find it difficult to leverage the cloud at scale because of bottlenecks that form around repetitive tasks, such as developers having to wait for provisioning infrastructure. Despite having access to abundant cloud resources, these speedbumps often make it difficult or impossible to achieve team objectives.
Join this talk to learn:
-How to safely delegate the management of your Azure deployment (to developers and other colleagues) with self-service operations.
-How to create powerful runbooks with guardrails that leverage existing scripting languages (including PowerShell), infrastructure, and tools to remove the human from the bottleneck that forms around repetitive tasks.
-Strategies for getting started
-And how to create an Easy Button to handle the repetitive tasks that are interrupting your flow of work.
As presented by Jesse Houldsworth at PowerShell + DevOps Global Summit 2021
Super-Charge Your Site Reliability Practices with Runbook Automation Rundeck
On Demand Viewing: https://www.rundeck.com/super-charge-reliability
To win in today’s digital age, organizations need to balance product reliability and feature delivery with dynamic business needs and legacy and multi-cloud environments. Automation, as a main SRE practice, scales product reliability practices by reducing tedious tasks related to production operations, freeing up engineers to work on innovation.
Whether you are in a traditional operations organization or a “you build it, you run it” team, this webinar will explore strategies for increasing automation to improve your Operations so you can continue to create excellent experiences for your customers.
-How you can reduce MTTR and eliminate toil with Self-Service Operations
-Common workflow challenges and opportunities
-How you can use Runbook Automation to enable Self-Service Operations
-Ways to leverage existing assets and workflows by integrating Rundeck with existing toolsets
-See a demo of real world cases
https://youtu.be/4jAf6cbxsgo
As operators, it’s our job to monitor infrastructure, systems and applications and only wake up humans for tasks machines can’t fix on their own. Automated remediation pairs monitoring and runbook automation, giving you a monitoring system that can trigger operational actions with runbook automation to shorten incident response times and avoid alert fatigue.
Rundeck Director of Product Management Forrest Evans and Sensu Developer Advocate Todd Campbell discuss the key role automated remediation plays in the monitoring journey, with live demos of both the Rundeck and Sensu integrations. You’ll learn all about monitoring as code workflows with the Sensu Observability Pipeline and how to deliver runbook automation with Rundeck — and see how the two together can help you achieve automated remediation.
Failure is inevitable. But are you incurring more downtime and disruption than necessary? Legacy incident response techniques have difficulty keeping up with the increasing pace of change and skyrocketing complexity of today’s application environments.
During this webinar, you’ll learn about modern incident response techniques that can dramatically shorten incidents and reduce escalations. Join the experts from Rundeck and PagerDuty as they share:
*How a real-time operations platform intelligently manages alerts and on-call mobilization, delivering the right people the right information at the right time
*How runbook automation gives front-line response teams self-service access to run automated workflows – or runbooks – that diagnose and resolve incidents without escalating to an expert.
*How to automatically detect, diagnose, and resolve incidents without human intervention.
https://youtu.be/9yYwTPMRSOY
Nathan Fluegel, head of Customer Success at Rundeck, talks clustering and high availability. We'll show how to deploy Rundeck servers in a clustered configuration with Rundeck Enterprise.
https://youtu.be/PmBIGP3M9sI
Understand how to migrate your Rundeck environment from the community edition to Enterprise, including the pros and cons of each migratory approach.
In this webinar, you will learn how to:
-Determine which migration approach is most appropriate for your environment
-Shift from a single-server to clustered environment
-Migrate jobs and projects while keeping a clean install
Business Continuity for Humans: Keeping Your Business Running When Your Peopl... Rundeck
Damon Edwards (Rundeck) presentation from TechStrongConf on June 4, 2020.
Learn more: https://www.rundeck.com/business-continuity-for-digital-operations
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... Ramesh Iyer
In today's fast-changing business world, companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
The Art of the Pitch: WordPress Relationships and Sales Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Essentials of Automations: Optimizing FME Workflows with Parameters Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
22. 12:00pm
Bridge Call: SREs, Foo SRE, Biz Manager
SRE: “It’s a problem with the Foo service”
Foo Service
“Can you fix it?”
“No.”
NOC (Bob): Update Ticket, + add Foo Lead Dev
Ticket Context Wagon: NOC (Bob), Biz Manager, Foo SRE
Partially Done Work, Escalation, Waiting
25. Dev
Foo Lead Dev (Karen): ding! Ignore.
App Manager: Hey did you see that ticket?
Foo Lead Dev (Karen): sigh. I’ll take a look
Scrum
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE
Context Switching, Interruption
30. Foo Lead Dev (Karen): I’m going to need more log files
Update Ticket: + add SysAdmin Team
Chat: “Can someone with access to Foo Service in Prod01 help me with ticket #42516?”
SysAdmin (Lee), Ticket: “logs attached”
Foo Lead Dev (Karen), Ticket: “no the other ones”
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee)
Interruption, Disconnected Access, Waiting, Context Switch
35. 2:00pm
Foo Lead Dev (Karen), Logs:
-Who restarted these services? (and why?)
-They didn’t use the correct environment variables!
-This entire service pool needs to be restarted!
Update Ticket
NOC (Bob): Update Ticket, + add Middleware Team
Ticket: “Middleware, please urgent restart this entire app pool with the correct environment variable”
Ticket Context Wagon
Partially Done Work, Waiting, Context Switching, Interruption
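Karen's finding on this slide is that a manual restart ran without the correct environment variables. As a minimal sketch of the kind of check a restart procedure could run first, the Python below compares the environment against expected values. The variable names (`FOO_ENV`, `FOO_CONFIG_DIR`), their values, and the helper `env_problems` are hypothetical and do not come from the talk.

```python
import os
from typing import Mapping

# Hypothetical expected settings for the Foo service pool (illustrative only).
EXPECTED_ENV = {"FOO_ENV": "prod01", "FOO_CONFIG_DIR": "/etc/foo"}


def env_problems(expected: Mapping[str, str], actual: Mapping[str, str]) -> list[str]:
    """Return one message per variable that is missing or has the wrong value."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name} is not set")
        elif got != want:
            problems.append(f"{name} is {got!r}, expected {want!r}")
    return problems


if __name__ == "__main__":
    issues = env_problems(EXPECTED_ENV, os.environ)
    if issues:
        print("Refusing to restart:", "; ".join(issues))
    else:
        print("Environment matches expectations.")
```

Putting the check in the restart procedure itself, rather than in one person's memory, is one way to keep the "who restarted these services, and how?" question from coming up again.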
40. 2:30pm
NOC (Bob)
Middleware Manager (Melissa): No way. It’s the middle of the day! You need business approval.
NOC (Bob): Update Ticket, + add SVP for Line of Business
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager
Context Switching, Interruption, Extra Process, Misaligned Priorities
44. 5:00pm
Update Ticket: + add SVP for Line of Business
SVP (Susan), Chief of Staff, Tech VP, Tech VP
Customer impact?
Update Ticket: “Restart approved”
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP
Context Switching, Interruption, Disconnected Context
48. 5:00pm
Ticket
Middleware Manager (Melissa): Who knows these production services the best?
Middleware, Middleware (Scott): Ellen!
Ellen to Europe office
Middleware (Scott): SharePoint .doc, Trial and error
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott)
Waiting, Manual, Siloed Knowledge
49. 6:00pm
Middleware (Scott): SharePoint .doc, Trial and error
Middleware (Scott): Bar Service
10 min
Middleware (Scott): Waiting for Acme Service
Acme startup failed
Bar Service
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott)
55. 6:45pm
Middleware (Scott), Ticket:
-Bar app startup timed out. Error says can’t connect to Acme service.
-I looked at Acme but it seems to be running
-Is this error message correct? Why can’t Bar connect?
Update Ticket: + add Bar SRE
Bar SRE (Linda)
The new environment pre-flight check is preventing startup. Looks like Bar’s connection to Acme is being blocked.
Middleware (Scott), Ticket:
-URGENT: Network connection issue between Bar and Acme
Update Ticket: + add Network SRE Team
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott), Bar SRE (Linda)
Escalation, Task Switching
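The slides do not say how Bar's pre-flight check works. As a rough sketch of one common form, the Python below tests whether a TCP connection to a dependency can be opened before startup proceeds. The function name `can_connect` and any host and port passed to it are assumptions made for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection raises OSError (including timeouts and refusals)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A check like this only reports that the path from Bar to Acme failed. It cannot say whether the cause is the service, the network, or a firewall rule, which is consistent with the hand-offs that follow in the story.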
59. 6:45pm
Bar SRE (Linda)
Middleware (Scott), Ticket:
-URGENT: Network connection issue between Bar and Acme
Update Ticket: + add Network SRE Team
Business Managers: Customers are calling. What is going on?
The new environment pre-flight check is preventing startup. Looks like Bar’s connection to Acme is being blocked.
Bar Lead Dev (Liu): I can comment out the test… But the CD pipeline only goes to QA ENV!
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott), Bar SRE (Linda)
Escalation, Task Switching, Disconnected Process
64. Middleware (Scott) to Network Dir (Carlos): Carlos, I need a favor. Can you escalate?
Middleware Manager (Melissa)
Customers are calling. What is going on?
Last week..
Net SRE, VP, VP: Priority! Different Incident!
Net SRE, Net SRE, Net SRE
It’s the network!
Business Managers: Your network is broken!
Network VPs: We are already working on it!
Distraction, Finger Pointing, Heroics, Waiting
69. 8:00pm
Network SRE (Hari): The firewall is blocking the traffic. You’ll have to take it up with the Firewall Team
Middleware (Scott), Ticket:
-URGENT: Firewall is blocking connection between Bar and Acme
Open Firewall Ticket: + add Firewall Team
Paging on-call…
Open bridge…
Firewall Engineer (Freddie): Can’t be the firewall, it hasn’t changed since last Thursday.
Middleware (Scott): No, it’s the firewall.
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott), Bar SRE (Linda), Network PM (Carlos), Network SRE (Bob)
Escalation, Task Switching, Siloed Knowledge, Interruption
72. 8:00pm
Firewall Engineer (Freddie): Can’t be the firewall, it hasn’t changed since last Thursday.
Middleware (Scott): No, it’s the firewall.
Firewall Engineer (Freddie): There was a rule change last Thursday that would stop Bar from talking to Acme.
Middleware (Scott): Can you change it back?
Firewall Engineer (Freddie): Sure we make changes on Thursday…
Chief of Staff: SVP and VPs are livid… this was supposed to be a safe change!! Freddie, we’ve got customers calling.
Firewall Engineer (Freddie): Update Firewall Ticket
Extra Process, Misaligned Priorities
77. 9:00pm
Firewall Engineer (Freddie), Ticket:
ESCALATE: Emergency production firewall rule change review
Update Firewall Ticket: + add NetSec
Paging on-call…
NetSec (Nicole): This is production so I’ll have to get others on the Network CAB…
Chief of Staff, Firewall (Freddie), Middleware (Scott), Middleware Manager, VP, VP, Bar Lead Dev
Customer outage!
… I’ll call SVP Susan
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP
Extra Process, Escalation, Task Switching, Misaligned Priorities
79. 9:30pm
Chief of Staff, Firewall (Freddie), Middleware (Scott), Middleware Manager, VP, VP, Bar Lead Dev
Customer outage!
… I’ll call SVP Susan
NetSec (Nicole), Ticket: APPROVE: Emergency firewall rule change
Update Firewall Ticket
Firewall (Freddie), Net L2 (Bob), Middleware (Scott): Firewall change, Restart Bar
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott), Bar SRE (Linda), Network PM (Carlos), Network SRE (Bob), Firewall (Freddie), NetSec (Nicole)
Waiting
90. 11:30pm
Ticket: “APIs OK”
Middleware (Scott), Update Ticket: “Services restarted OK”
NOC: Lights are green… I guess it is fixed.
NOC (Bob): Close Ticket
Zzz
Ticket Context Wagon: NOC (Bob), Biz Manager, App Manager, Lead Dev (Karen), Foo SRE, SysAdmin (Lee), Middleware Manager, SVP, Chief of Staff, 2 x Tech VP, Middleware (Scott), Bar SRE (Linda), Network PM (Carlos), Network SRE (Bob), Firewall (Freddie), NetSec (Nicole), Cust. Engmt. (Varsha)
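Taking the clock times printed on the slides at face value, the arithmetic of the incident can be tallied in a few lines of Python. The timestamps are from the slides; the event labels are paraphrases, and the helper names `minutes` and `gaps` are illustrative.

```python
from datetime import datetime

# Timestamps as shown on the slides; event labels are paraphrased.
TIMELINE = [
    ("12:00pm", "bridge call, Foo Lead Dev added to ticket"),
    ("2:00pm", "app pool restart requested from Middleware"),
    ("2:30pm", "Middleware Manager requires business approval"),
    ("5:00pm", "restart approved"),
    ("6:00pm", "Bar restart stalls waiting for Acme"),
    ("6:45pm", "network connection issue escalated"),
    ("8:00pm", "firewall ticket opened"),
    ("9:00pm", "emergency firewall rule change review"),
    ("9:30pm", "emergency firewall rule change approved"),
    ("11:30pm", "ticket closed"),
]


def minutes(clock: str) -> int:
    """Convert a 12-hour clock string such as '6:45pm' to minutes since midnight."""
    t = datetime.strptime(clock, "%I:%M%p")
    return t.hour * 60 + t.minute


def gaps(timeline):
    """Return (start, end, minutes) for each consecutive pair of timestamps."""
    return [
        (a, b, minutes(b) - minutes(a))
        for (a, _), (b, _) in zip(timeline, timeline[1:])
    ]


total = minutes(TIMELINE[-1][0]) - minutes(TIMELINE[0][0])
longest = max(gaps(TIMELINE), key=lambda g: g[2])
print(f"Elapsed: {total // 60}h {total % 60}m")
print(f"Longest gap: {longest[0]} to {longest[1]} ({longest[2]} minutes)")
```

On these timestamps the incident runs 11 hours 30 minutes from the first bridge call to the closed ticket, and the longest single gap, 150 minutes, falls between the demand for business approval and the approval itself.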
91. NOC: Lights are green… I guess it is fixed.
NOC (Bob): Close Ticket
Zzz
Next Day
SVP (Susan): Whose fault is this?! Why are we so bad at change? What additional processes and approvals are you adding to never let this happen again?!
VP, VP, VP, VP, Dir, Dir, Dir
Middleware (Scott), Bar SRE (Linda), Network PM (Carlos), Network SRE (Bob), Firewall (Freddie), NetSec (Nicole), Cust. Engmt. (Varsha)
93. Executive Team: We’ve invested in Cloud, Agile, DevOps, Containers… Why does everything still take too long and cost too much?
Our transformation has largely ignored Ops
100. …by following the conventional wisdom:
“We need better tools”
“We need more people”
“We need more discipline and attention to detail”
“We need more change reviews/approvals”
“We’ll wait and see what ITIL v4 says”
115. All work is contextual
rm -rf $PATHNAME
Is this dangerous?
Answer is always “it depends”
John Allspaw
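A small Python sketch can make the "it depends" answer concrete: the same command is harmless or destructive depending on what `$PATHNAME` holds when it runs. The classifier below is illustrative only. The `CRITICAL` set and the function `rm_rf_risk` are hypothetical, the variable is treated as unquoted (so it is split on whitespace, roughly as a shell would), and globbing is ignored. Nothing is deleted.

```python
from pathlib import PurePosixPath

# Hypothetical set of paths this illustrative team treats as critical.
CRITICAL = {PurePosixPath("/"), PurePosixPath("/etc"), PurePosixPath("/var/lib")}


def rm_rf_risk(pathname: str) -> str:
    """Classify what `rm -rf $PATHNAME` would target for a given variable value."""
    words = pathname.split()  # an unquoted variable is split on whitespace
    if not words:
        return "no target"
    targets = [PurePosixPath(w) for w in words]
    for t in targets:
        # Dangerous if the target is a critical path or contains one.
        if t in CRITICAL or any(t in c.parents for c in CRITICAL):
            return "dangerous"
    if any(not t.is_absolute() for t in targets):
        return "depends on working directory"
    return "scoped"


for value in ["/tmp/build-123", "/tmp/x /", "build", ""]:
    print(repr(value), "->", rm_rf_risk(value))
```

Even this toy version needs context that the command text does not carry: the value of the variable, the working directory, and which paths matter to the business.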
116. Context
1° 2° 3° 4°
escalate, escalate, escalate
or
Where are decisions made? Who can take action?
117. Psychological safety
Psychological safety is a shared belief that the team is safe for
interpersonal risk taking. It can be defined as "being able to show
and employ one's self without fear of negative consequences of
self-image, status or career.
- William Kahn
Boston University
1990
118. Psychological safety
Psychological safety is a shared belief that the team is safe for
interpersonal risk taking. It can be defined as "being able to show
and employ one's self without fear of negative consequences of
self-image, status or career.
- William Kahn
Boston University
1990
Google: most important characteristic
to predict team effectiveness?
2016
119. Psychological safety
Psychological safety is a shared belief that the team is safe for interpersonal risk taking. It can be defined as "being able to show and employ one's self without fear of negative consequences of self-image, status or career."
- William Kahn, Boston University, 1990
Google (2016): most important characteristic to predict team effectiveness?
Psychological safety!
122. Toil: Name For a Problem We’ve All Felt
“Toil is the kind of work tied to running a production
service that tends to be manual, repetitive,
automatable, tactical, devoid of enduring value, and
that scales linearly as a service grows.”
-Vivek Rau
Google
123. Toil vs. Engineering Work
Toil                  | Engineering Work
Lacks Enduring Value  | Builds Enduring Value
Rote, Repetitive      | Creative, Iterative
Tactical              | Strategic
Increases With Scale  | Enables Scaling
Can Be Automated      | Requires Human Creativity
126. Excessive Toil Prevents Fixing the System
[Diagram: two capacity bars split between Toil and Engineering Work (E.W.)]
Toil at manageable percentage of capacity: engineering work can reduce toil and improve the business
Toil at unmanageable percentage of capacity (“Engineering Bankruptcy”): no capacity to reduce toil, no capacity to improve business
Downward spiral is inevitable!
131. Silos cause disconnects and mismatches
[Diagram: Silo A (“I need X”) sends requests for X to Silo B (“I do X”). Each silo has its own Backlog, Information, Priorities, and Tools.]
Mismatches between the silos: Context, Process, Tooling, Capacity
135. How do we cover for our silos’ disconnects and mismatches?
[Diagram: Silo A and Silo B, connected by a Ticket Queue]
We all know how well that works
136. Ticket queues are an expensive way to manage work
Queues Create…
Longer Cycle Time
Increased Risk
More Variability
More Overhead
Lower Quality
Less Motivation
Adapted from Donald G. Reinertsen, The Principles of Product Development Flow: Second Generation Lean Product Development
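One way to see why queues lengthen cycle time is a basic queueing model. The sketch below assumes a single-server M/M/1 queue (random arrivals, random service times). That model is an illustration chosen here, not something the slides specify, and the function name and example rates are hypothetical.

```python
def mm1_cycle_time(arrival_rate: float, service_rate: float) -> float:
    """Average time a request spends waiting plus being worked (M/M/1 model).

    arrival_rate: requests arriving per day (hypothetical figure)
    service_rate: requests the team can complete per day (hypothetical figure)
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate must be > 0")
    if arrival_rate >= service_rate:
        # At or above 100% utilization the queue grows without bound.
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 10.0  # assumed: the team completes 10 tickets per day
    for utilization in (0.5, 0.8, 0.9, 0.95):
        arrival_rate = utilization * service_rate
        days = mm1_cycle_time(arrival_rate, service_rate)
        print(f"utilization {utilization:.0%}: average cycle time {days:.2f} days")
```

Under these assumptions, moving from 50% to 95% utilization raises the average cycle time from 0.2 to 2.0 days, which is the kind of nonlinear growth the "Longer Cycle Time" and "More Variability" items point at.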
148. “Shift Left” the ability to take action
[Diagram: tiers 1° 2° 3° 4°, with “escalate” arrows from each tier to the next]
Push the ability to take action this direction (left, toward 1°)
Tools
Enablement and tooling
154. Reduce Toil
1. Track toil levels for each team
2. Set toil limits for each team
3. Fund efforts to reduce toil (with emphasis on teams over toil limits)
Bonus: Use Service Level Objectives, Error Budgets, and other lessons from SRE
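The three steps above can be sketched as a small report. This is a minimal illustration, not a prescribed method: the `TeamToil` record, the team names, the hours, and the 50% default limit are all assumptions made for the example (the slides do not state a limit).

```python
from dataclasses import dataclass


@dataclass
class TeamToil:
    team: str
    toil_hours: float   # hours spent on toil in the period (step 1: track)
    total_hours: float  # total working hours in the period

    @property
    def toil_fraction(self) -> float:
        return self.toil_hours / self.total_hours if self.total_hours else 0.0


def teams_over_limit(reports, limit=0.5):
    """Step 2: compare each team to the limit. Returns (team, fraction), worst first."""
    over = [(r.team, r.toil_fraction) for r in reports if r.toil_fraction > limit]
    return sorted(over, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    reports = [  # hypothetical data
        TeamToil("team-a", toil_hours=120, total_hours=400),
        TeamToil("team-b", toil_hours=260, total_hours=400),
        TeamToil("team-c", toil_hours=340, total_hours=400),
    ]
    # Step 3: teams over the limit are the first candidates for funded toil reduction.
    for team, fraction in teams_over_limit(reports):
        print(f"{team}: {fraction:.0%} toil, over limit")
```

Sorting worst-first reflects the slide's emphasis on teams over their toil limits; how toil hours are actually measured is left open here.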
158. Obvious: Get rid of as many silos as possible
[Diagram: Old Silo A, Old Silo B, Old Silo C, and Old Silo D are replaced by Cross-Functional Team 1, Cross-Functional Team 2, … Cross-Functional Team n]
“Horizontal” shared responsibility, not everyone doing everything!
161. Shared and dedicated responsibility is key
“Netflix” Model: Cross-Functional Team 1, Cross-Functional Team 2, … Cross-Functional Team n
“Google” Model: Development Team 1, Development Team 2, … Development Team n, plus an SRE Team
- Clear handoff requirements
- Error budget with consequences
Same high-quality, high-velocity results!
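For readers new to the term, an error budget is the amount of unreliability an availability target allows (1 minus the SLO). The sketch below is illustrative only: the 30-day window, the function names, and the "freeze releases" consequence are assumptions, since the slides do not say which consequences apply.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be between 0 and 1, for example 0.999")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Minutes of budget left; zero or less means the budget is spent."""
    return error_budget_minutes(slo, window_days) - downtime_minutes


def release_policy(slo: float, downtime_minutes: float, window_days: int = 30) -> str:
    """Hypothetical consequence: pause feature releases once the budget is spent."""
    if budget_remaining(slo, downtime_minutes, window_days) > 0:
        return "ship"
    return "freeze"


if __name__ == "__main__":
    slo = 0.999  # assumed target: 99.9% availability
    print(f"budget: {error_budget_minutes(slo):.1f} minutes per 30 days")
    print("after 10 minutes of downtime:", release_policy(slo, 10))
    print("after 60 minutes of downtime:", release_policy(slo, 60))
```

A 99.9% target over 30 days leaves roughly 43 minutes of budget, so the second case above has overspent it.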
164. But what about the cross-cutting concerns?
[Diagram: Old Silo A through Old Silo D are replaced by Cross-Functional Team 1, Cross-Functional Team 2, … Cross-Functional Team n, alongside three groups of Specialist Capabilities. Ticket Queues appear around the teams and the specialist groups.]
166. Self-Service Operations: Turn handoffs into self-service
[Diagram: Cross-Functional Product Team 1, Cross-Functional Product Team 2, … Cross-Functional Product Team n, each with embedded Ops, use a Self-Service Operations layer on demand. Ops operates the platform. Each Ops Capability is provided by SRE, Dev, or Specialist.]
167. Self-Service Operations: Works with any org model
[Diagram: Development Team 1, Development Team 2, … Development Team n and an Ops/SRE Team use a Self-Service Operations layer on demand. Ops operates the platform. Each Ops Capability is provided by SRE, Dev, or Specialist.]
168. But, what about security and compliance?
[Diagram: Development Team 1, Development Team 2, an Ops/SRE Team, and Cross-Functional Product Team n (with embedded Ops) use a Self-Service Operations layer on demand. Ops operates the platform. Each Ops Capability is provided by SRE, Dev, or Specialist.]
Build in security here
Build in compliance here
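As a rough illustration of building security and compliance into a self-service layer, the sketch below checks a role-based policy before running a predefined job and records every attempt. It assumes the callouts refer to the self-service layer itself; the policy table, job names, roles, and `run_job` function are hypothetical and do not represent any particular product's API.

```python
from datetime import datetime, timezone

# Hypothetical access policy: which roles may run which predefined jobs.
JOB_POLICY = {
    "restart-service": {"dev", "noc", "sre"},
    "rotate-credentials": {"sre"},
}

# In a real system this would be durable, append-only storage.
AUDIT_LOG = []


def run_job(user: str, role: str, job: str, action) -> bool:
    """Run a predefined job if the role is allowed; record every attempt."""
    allowed = role in JOB_POLICY.get(job, set())
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "job": job,
        "allowed": allowed,
    })
    if allowed:
        action()
    return allowed


if __name__ == "__main__":
    run_job("alice", "dev", "restart-service", lambda: print("restarting service"))
    run_job("alice", "dev", "rotate-credentials", lambda: print("rotating credentials"))
    for entry in AUDIT_LOG:
        outcome = "allowed" if entry["allowed"] else "denied"
        print(entry["user"], entry["job"], outcome)
```

The design choice here is that the access check (security) and the audit record (compliance evidence) happen in the same place the work is requested, so no separate ticket is needed to get either.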
173. Are all tickets bad?
No. Just use tickets for what they are good for:
1. Documenting true problems/issues/exceptions
2. Routing for necessary approvals
Not as a general purpose work management system!
176. Strategy: Self-Service improves response times
Jody Mulkey at DOES ’15 SF
https://youtu.be/USYrDaPEFtM
[Diagram: DEV, STAGE, and PROD environments, each with Services, Monitoring, and Scripts/Tools. Labels: Dev & QA; NOC/Ops; Dev; Self-Service (twice); Promote approved jobs; Empower]
180. Strategy: Self-Service improves consistency & compliance
Shaun Norris at DOES ’18 London
https://youtu.be/d5IMvK0YHTg
Optimized for compliance
• 86,000+ employees
• 60+ countries
• Highly regulated
[Diagram: LOB #1, LOB #2, LOB #3, … LOB n use Self-Service across Data Center and Cloud environments, each with Services and Scripts/Tools. Labels: Compliance, Consistency]
12 months: 13,000+ ops tasks in privileged environments that didn’t require a review
182. Recap
Don’t forget about Ops.
Challenge conventional wisdom.
Leverage the Self-Service Operations design pattern
“Shift-Left” control and decision making.
Focus on removing silos and queues
Learn from SRE: Reduce toil to create capacity to change
Understand the forces undermining operations work