Damon Edwards, co-founder of Rundeck, presents at Salt Lake City DevOps Meetup, November 13, 2019.
There is no doubt that DevOps has changed how we deliver software. But what about after deployment? Whether you are in a traditional operations organization or a “you build it, you run it” team, how do you mobilize, resolve, and learn from incidents? This talk will look at how high performing organizations have applied DevOps and SRE practices to shorten incidents and reduce escalations. Less frustration for the engineers. Lower costs for the business. Everybody wins.
See a demo of Rundeck Enterprise:
https://www.rundeck.com/see-demo
--or--
Download Rundeck Open Source here:
https://rundeck.com/open-source
Connect:
Stack Overflow community: https://stackoverflow.com/questions/tagged/rundeck
Github: https://github.com/rundeck/rundeck/issues
Twitter: https://twitter.com/Rundeck
Facebook: https://www.facebook.com/RundeckInc/
LinkedIn: https://www.linkedin.com/company/rundeck-inc
How to bootstrap an SRE team into your company. How to hire them, what to have them work on, and how to interact with them as a team. Finally, some thoughts on general practices to consider before your SREs arrive. There are also kitten pictures.
Managing a team and managing a project are closely related. In particular, teams require an effective distribution of responsibilities and roles. Once that is set up, a proper process guides people to make progress. All this fits into a product lifecycle, which is essential to develop the right product, in the right way, and deliver it at the right time.
<p>From <a href="https://en.wikipedia.org/wiki/Site_reliability_engineering" target="_blank">Wikipedia</a>: Site reliability engineering (SRE) is a discipline that incorporates aspects of software engineering and applies that to operations whose goals are to create ultra-scalable and highly reliable software systems.</p>
<p>Over the past year, Acquia has built its own SRE team to help our products and services scale with the demand of our growing number of customers. We wish to share our experience so that others are enabled to do the same and reap the rewards.</p>
<p>This presentation will discuss how the SRE team came about at Acquia, what achievements we have made so far, and the lessons we have learned along the way. We will then show the steps on how to introduce SRE to your workplace so you can deliver more reliable and scalable services to your customers! We will specifically cover:</p>
<ul>
<li>SRE's basic concepts and history from Google</li>
<li>The management support you will need to get started</li>
<li>Introducing the idea of service level objectives and error budgets</li>
<li>Operational Responsibility Assessments as a tool to measure risk</li>
<li>Creating a Launch Readiness Checklist to standardize and improve product launches</li>
<li>Finding ideal candidates for your SRE team</li></ul>
<p>The intended audience is software engineers, system administrators, and managers who want to improve how they do their work and how their products and services perform.</p>
Site Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
When it comes to Site Reliability Engineering (SRE), the resources available online are limited to the books published by Google. They share some useful case studies that help us understand what SRE is and what its concepts mean, but they do not clearly explain how to build your own SRE team for your organization. The concept of SRE was cooked fresh within the walls of Google and later released to the general public as a practice for anyone to follow.
In this presentation I would like to give a brief introduction to SRE and why it is important to any Software Engineering organization. This is based on my experiences and learnings from leading a Site Reliability Engineering team for leading organizations in the US and Norway.
I gave this presentation as a Tech Talk while working as an Associate Technical Lead at Creative Software Sri Lanka.
An overview of Google's Site Reliability Engineering with a view toward possible incorporation in the IEEE P2675 DevOps security standard. (Creative Commons with credit.)
Getting started with Site Reliability Engineering (SRE) Abeer R
"Getting started with Site Reliability Engineering (SRE): A guide to improving systems reliability at production"
This is an intro guide that shares some of the common concepts of SRE with a non-technical audience. We will look at both the technical and organizational changes that should be adopted to increase operational efficiency, ultimately supporting global optimizations such as minimizing downtime and improving systems architecture and infrastructure:
- Improving incident response
- Defining error budgets
- Better monitoring of systems
- Getting the best out of systems alerting
- Eliminating manual, repetitive actions (toil) through automation
- Designing better on-call shifts/rotations
How to design the role of the Site Reliability Engineer (who effectively works between application development teams and operations support teams)
Site Reliability Engineer (SRE), We Keep The Lights On 24/7 NUS-ISS
There are many phases in the software development cycle, from requirements to development and testing, but at the tail of the process is an often overlooked aspect: deployment and delivery. With the paradigm shift from delivering on-site software to offering software-as-a-service, Site Reliability Engineering is beginning to take a greater role in product delivery.
This session aims to give a glimpse of the work that goes into site reliability engineering (SRE) and the effort that goes into keeping a service going 24/7.
Service Level Terminology: SLA, SLO & SLI Knoldus Inc.
Measuring outcomes is always at the top of our mind when approaching goals. While we do have specific targets we may be aiming for, circling back to confirm that the resulting outcome is in fact what you were after is extremely important. Small course corrections are required. Outcomes may be more general but often attract the attention and support of decision-makers earlier.
Key measurements and thresholds to hold us accountable for our efforts as well as communicate expectations across the entire organization needed to be established. Nearly every resource you find regarding site reliability engineering will talk about key metrics used to establish high-level objectives, indicators of the movement toward or away from those objectives, and ultimately what agreements are in place should objectives be unfulfilled.
SLIs will help us know how we are performing against our SLOs, and our SLA will outline the consequences (good or bad) of meeting or missing those objectives. Once we have data to observe, we will begin orienting ourselves to it and establish what we believe our SLIs and SLOs to be.
Here’s an outline of the webinar -
~ Learn what an SRE is and isn't.
~ Understand the difference between service-level indicators (SLI), service-level objectives (SLO), and service-level agreements (SLA).
~ Gain an understanding of error budgets and how to calculate reliability cost.
~ Learn how SREs can embed themselves within development teams to increase operational stability
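As a rough illustration of the error-budget arithmetic mentioned in the outline, here is a minimal Python sketch. It is not taken from the webinar: the function name `evaluate_slo`, the `SloReport` fields, and the request counts are hypothetical, and it assumes a simple request-based SLI (good requests divided by total requests) measured over one window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float               # measured fraction of good requests
    allowed_failures: float  # failures the SLO tolerates in this window
    budget_remaining: float  # fraction of the error budget still unspent


def evaluate_slo(good: int, total: int, slo_target: float) -> SloReport:
    """Compare a measured SLI against an SLO target (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    failed = total - good
    # The error budget is whatever the SLO leaves over: (1 - target) * total.
    allowed = (1.0 - slo_target) * total
    return SloReport(
        sli=good / total,
        allowed_failures=allowed,
        # Goes negative once more failures occurred than the budget allows.
        budget_remaining=1.0 - failed / allowed,
    )


# Made-up numbers: 600 failed requests out of 1,000,000 against a 99.9% SLO.
report = evaluate_slo(good=999_400, total=1_000_000, slo_target=0.999)
print(f"SLI={report.sli:.4%} budget_remaining={report.budget_remaining:.0%}")
```

A negative `budget_remaining` means the window has already spent more than its budget, which is the usual signal to favor reliability work over new releases.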
SRE-iously: Defining the Principles, Habits, and Practices of Site Reliabilit... New Relic
No matter how you define it, the Site Reliability Engineer (SRE) role is clearly expanding into more and more companies. To be effective in this new role, SREs must possess a depth of understanding of how different systems work together, how they fail, how they can be improved, and how they can best be designed and monitored.
Microservices Architectures: Become a Unicorn like Netflix, Twitter and Hailo gjuljo
Full-day workshop about Microservices Architectures, from the basics to advanced topics like Service Discovery, Load Balancing, Fault Tolerance and Centralized Logging.
Many technologies are involved, like Spring Cloud Netflix, Docker, Cloud Foundry and ELK.
A separate deck describes all the lab exercises.
Overview of Site Reliability Engineering (SRE) & best practices Ashutosh Agarwal
In any software organization, stability and innovation are always at loggerheads: the faster you move, the more things will break. This talk describes what an SRE organization looks like at high-tech companies (Google, Uber).
Cloud Native Engineering with SRE and GitOps Weaveworks
Site reliability engineering (SRE), a model championed by Google, is a software engineering approach to IT operations. For companies striving to become cloud native and adopting modern tools such as Kubernetes, SRE best practices are crucial for success.
In this webinar, Brice, one of our seasoned Customer Reliability Engineers, will show how to design a fail-proof Kubernetes platform using tried and tested SRE and GitOps methods.
He will share best practices on:
Increasing performance and ensuring scalability
Managing incident responses through disaster recovery
Designing for High Availability in Kubernetes
Achieving 360-degree visibility and alerts for your platform
SRE-iously! Defining the Principles, Habits, and Practices of Site Reliabilit... Tori Wieldt
How do you make DevOps magic when you aren’t Google? This talk will help whether you’re still figuring out how to create a site reliability practice at your company or you’re trying to improve the processes and habits of an existing SRE team.
Comprehensive overview of using Test Driven Development (TDD), Behavior Driven Development (BDD), Continuous Integration (CI), Continuous Delivery (CD), Development Operations (DevOps), and Development Operations Security (DevOpsSec). Describes the current global environment, basic lean and agile principles, and the evolution of Microservices. From there, a detailed deep-dive of TDD, BDD, CI, CD, DevOps, and DevOpsSec principles and practices ensues. Closes by identifying key DevOps tool automation ecosystems/pipelines, metrics, case studies, return on investment (ROI)/business cases, implementation roadmaps, adoption statistics, leadership insights, and a summary. Contains a lot of helpful data for constructing DevOps strategic business cases as well as tactical implementation strategies (while not ignoring essential elements such as microservices, containerization, and application security).
How does Google work, and how can you benefit from it? Test drive a complete Microservices application with Istio, gRPC, Redis, BigQuery, Spring Boot, Spring Cloud and Stackdriver on Google Cloud Platform: https://git.io/fhzCx
A high level introduction to DevOps. Explains what it is, how popular DevOps has become, why DevOps is popular, how DevOps differs from traditional approaches and some next steps to implementation.
Devops On Cloud Powerpoint Template Slides Powerpoint Presentation Slides SlideTeam
Introducing DevOps On Cloud PowerPoint Template Slides PowerPoint Presentation Slides. Provide an overview of DevOps with this attention-grabbing PPT slideshow. This presentation helps to understand the need for DevOps, how it is different from traditional IT, DevOps use cases in business, lifecycle, roadmap, and so on. Provide an overview of how DevOps is different from agile by using the content-ready DevOps strategy PPT visuals. The slides also explain the roles, responsibilities, and skills of DevOps engineers. DevOps automation tools and DevOps roadmap for implementation in the organization can be discussed effectively. Provide an overview of DevOps on the cloud by describing cloud computing, characteristics of cloud computing, benefits, top risks related to cloud computing, etc. Cloud computing use cases and cloud deployment models can be presented with the help of visual attention-grabbing DevOps implementation roadmap PowerPoint slides. The roadmap to integrate cloud computing in business can be depicted easily by using the DevOps implementation strategy PowerPoint slideshow. https://bit.ly/3d8uYRY
SRE (service reliability engineer) on big DevOps platform running on the clou... DevClub_lv
SRE (service reliability engineer). The talk explains the SRE philosophy and the principles of production engineering and operations in clouds.
(Language – English)
Pavlo is the ADOP (Accenture DevOps Platform) Service Reliability Team Lead and an SRE practitioner. He has more than 18 years of IT experience in Ops and Dev.
Incident Management in the Age of DevOps and SRE Rundeck
Keynote presentation at DevOps Con Munich, December 3, 2019, presented by Damon Edwards, co-founder of Rundeck.
Responding to incidents has always been the core job of Operations. With the rise of DevOps and SRE, how Operations work gets done — and who is doing the work — is changing. This talk will look at how high-performing organizations are applying DevOps and SRE practices to shorten incidents and reduce escalations.
Incident Management in the Age of DevOps and SRE Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at QCon San Francisco 2019.
Keeping Your DevOps Transformation From Crushing Your Ops Capacity Rundeck
Presentation by Damon Edwards, co-founder of Rundeck, at DevOps Enterprise Summit in San Francisco, November 13, 2017
Helping Ops Help You: Development’s Role in Enabling Self-Service Operations Rundeck
Presented by Damon Edwards, co-founder of Rundeck, at JAX DevOps and Finance London, April 5, 2017.
DevOps has provided plenty of lessons for how to speed up the pace of delivery and frequency of deployments. But, delivery and deployment only covers one part of the day-to-day life for developers in large enterprises.
What about what happens after deployment? In most cases, increasing the pace of delivery and frequency of deployment just increases the operational support load, work interrupts, and context switching that has always cut deeply into a development team’s time.
This talk focuses on the successful design patterns that high-performing, large-scale organizations have applied to reduce the operational burden and support costs across their entire organization. Specifically, we’ll look at how they apply DevOps principles to improving the post-deployment lifecycle and how Developers play the key role in reducing the difficulty and cost of operations activity for everyone.
2014-10 DevOps NFi - Why it's a good idea to deploy 10 times per day v1.0 Joakim Lindbom
Corporations are struggling with overly complex systems and system landscapes. DevOps is presented as one piece of the puzzle to go for much leaner and simpler landscapes - all in order to increase the readiness for change and innovation.
The presentation also discusses the basic error in thinking behind organising according to Design-Build-Run, which is the basis for most ICT IM outsourcing.
Teaching Elephants to Dance (and Fly!): A Developer's Journey to Digital Tran... Burr Sutter
We can be brilliant developers, but we won’t succeed—and won’t lead our organizations to succeed—without a new perspective (if you will) and new assumptions about the components of the “technology ecosystem” that are fundamentally critical to our success. This includes the operators, QA team, DBAs, security folks, and even the pure business contingent—in most cases, each of these individuals and groups plays a critical role in the success of what we create and give birth to as developers. What we do in isolation might be genius, but if we insulate ourselves—especially with arrogance—from these colleagues, neither our code nor our organizations will realize their full potential, and most will fail. The bottom line is that our old ways are no longer viable, and as the elite within our industry, we will be the leaders and heroes who discard old assumptions and adopt a new perspective in this exciting journey to digital transformation—where the impossible can become reality.
Self-Service Operations: Because Failure Still Happens (Developer Edition) Rundeck
Keynote presentation at DevNet Create 2017 by Damon Edwards, co-founder of Rundeck.
Agile and DevOps have provided plenty of lessons for how to speed up the pace of application delivery and the frequency of application deployment. But delivery and deployment only covers one part of the day-to-day life of developers in large enterprises. What about what happens after deployment? In many enterprises, increasing the pace of delivery and frequency of deployment has just increased the operational support load, work interrupts, and context switching that were already cutting deeply into development teams' time.
This talk will focus on the successful design patterns that high-performing, large-scale organizations have applied to reduce the operational burden and support costs across their entire organization. Specifically, we’ll look at how they apply DevOps principles to improving the post-deployment lifecycle and how Developers play the key role in reducing the difficulty and cost of operations activity for everyone.
Herding Microservices – the Atlassian Way Atlassian
You know the story – your team needs to build a new integration for a new system and you decide, "A microservice would be perfect for this!" Before you know it, there are dozens of integrations and dozens of microservices spinning out of control.
In this session, Matej Konecny will walk you through his team's process for herding dozens of microservices at Atlassian. He'll share the processes they've put in place to manage and scale incident responses, make the service dependencies discoverable, and architect sustainable systems as they keep on building more integrations.
This is a presentation I gave to 100+ people at Rev1 Ventures in Columbus, OH. The presentation was about how to define DevOps. Like any new concept, there are multiple and sometimes competing definitions. I've found that implementations of DevOps can change but there are some very common anti-patterns. Lastly, I talk about how we implement DevOps at Bold Penguin.
How We end the Walking Dead in the Enterprise - Session Sponsored by Versent Amazon Web Services
We've all experienced it; attempt after attempt to bring contemporary transformation to the workplace, only to be attacked by 'the old guard' of the technology landscape. In 2016, companies are learning that many old methods/processes/technologies have zombified, or are already infected and dying. Old world solutions, powered by the fear of the new, the uncertainty of change and the doubt that it will scale, are quickly limiting companies by their inability to rapidly change.
However, a growing number of CIOs and leadership teams have embraced the 'full stack' revolution and are reaping the benefits of true Mode 2 transformation. We'd like to share some insight from real world customers who've built 'new world' material ecosystems that stand the rigour of these internal and external threats. Leaders and technologists who've leaned into change and avoided becoming another member of 'the walking dead'.
Topics Include:
Strategies to deliver elastic, utility cloud across your enterprise.
Square peg, round hole & your Operating Model – Mode 2.
Don't bypass Service Management – (hint: it's not faster). Tips & Tricks.
Belligerent, aggressive automation – JDI.
Tales from the Battlefield – a real world example.
Speaker: Thor Essman, CEO, Versent & James Coxon, GM, Cloud & Digital, Founder, Versent
“Serverless” can be defined as a couple simple things: 1 - It’s a programming model for structuring applications as functions and events (basically a manifestation of microservices). 2 - It’s a cloud business model, where use is billed by the function call instead of by the provisioned server, so apps only pay when they run and for how long they run, eliminating over-provisioning and typically reducing costs.
In this talk, we’ll cover the what, why and how of serverless, and learn more about it through running code.
Throughout the session, we’ll focus on how the serverless model is being leveraged in the real world - not just toy functions and demos. Legacy enterprise apps - which are typically monolithic, written by large teams of Java and .Net devs, and resembling a bit of a mud ball - are being shaved down to take advantage of serverless, and we’ll be sharing some early results from those efforts. We'll discuss examples of how Fortune 50 companies are building their serverless projects on the Kubernetes and Mesos clouds they have already deployed.
The term “Serverless” has several meanings: 1 - a programming model for structuring applications as functions and events (essentially a manifestation of microservices); and 2 - a cloud business model, where use is billed by the function call rather than by the provisioned server, so that applications only pay when they run and for how long they run, eliminating over-provisioning and reducing the associated costs.
In this talk, we will cover the what, why and how of Serverless, and learn more about it by running code. We will focus on how the Serverless model is used in the real world, not just functions and demos. Legacy enterprise applications, which are typically monolithic, written by large teams of Java and .Net developers, and resembling something of a big ball of mud, are being shaved down to take advantage of Serverless, and we will share preliminary results from these efforts.
Self-Service Operations: Because Ops Still Happens Rundeck
Keynote Presentation by Damon Edwards, co-founder of Rundeck, at DevOps Days Austin, May 4, 2017.
Deployment is a solved problem. Sure there is still work to be done, but the DevOps community has successfully proven that anyone can both scale deployment automation and distribute the capability to execute deployments. Now, we have to turn our attention to the next critical constraint: What happens after deployment?
We all know that failure is inevitable and is coming our way at any moment. How do we respond quickly and effectively to those failures? What works when there is just a small set of teams or an isolated system to manage will quickly break down when the organization grows in size and complexity. But on the other hand, what has been commonly practiced in large-scale enterprises is proving to be too cumbersome, too silo dependent, and simply too slow for today's business needs.
How do we rapidly respond to incidents and recover complex interdependent systems while working within an equally complex and interdependent organization? How do Ops teams embrace the DevOps and Agile inspired demand for speed while maintaining quality and control?
This talk examines the trial-and-error lessons learned by some forward-thinking enterprises who are currently streamlining how they:
-Resolve incidents
-Reduce friction between teams
-Divide up operational responsibilities
-Improve the quality of their ongoing operations (and organizational learning)
Microservice Orchestration at any Scale - Zalando Tech Meetup 09/2017 Zeebe
Presentation given by Thorben Lindhauer and Daniel Meyer at Camunda Night at Zalando Inno Lab https://www.meetup.com/Zalando-Tech-Events-Berlin/events/242890035/
Damon Edwards, co-founder of Rundeck, presentation at Nexus Conf 2018 on how Security teams can help Operations and, in turn, help themselves.
Making Observability Actionable At Scale - DBS DevConnect 2019 Squadcast Inc
Many organisations already possess a vast amount of existing data about production systems. As customer expectations evolve, organisations are often challenged to find more proactive ways of dealing with traditionally reactive incident response activity. In this talk, we discuss approaches to unlock value from this data by making it truly actionable. Understanding production failure modes better, enriching technical and business context effectively, decomposing response activity into shared primitives, actions and workflows, and overall, sharing and augmenting this active knowledge repository on a continuous basis are key takeaways. Through case studies, we'll discuss how we can accomplish this by engineering your observability processes and tooling to work for human-in-the-loop interpretation and response rather than a purely human-reliant strategy.
Rundeck Community Office Hours: Using Variables with Job Steps Rundeck
Rundeck offers powerful runbook automation. Most Runbooks are complicated multi-step processes. We will show various examples of how to share data from one step to another through the use of Log Filters.
Come join this session to learn how to:
Use different types of Log Filters to gather variables from your Job Steps
Gather variables and use the values in other Job Steps
Use the Result Data feature to format your output in a consistent format regardless of the log output.
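The idea of capturing a value from one step's log output and reusing it in a later step can be sketched outside Rundeck. The following is a minimal Python illustration, not Rundeck's implementation: the `RUNDECK:DATA:key=value` line format, the regular expression, the `${data.key}` placeholder syntax, and the function names `capture_variables` and `render_step` are all assumptions made for this example.

```python
import re

# Assumed convention for this sketch: a step prints "RUNDECK:DATA:<key>=<value>".
DATA_LINE = re.compile(r"^RUNDECK:DATA:\s*(\S+?)\s*=\s*(.+)$")
PLACEHOLDER = re.compile(r"\$\{data\.([^}]+)\}")


def capture_variables(log_output: str) -> dict:
    """Collect key/value pairs from the lines of a step's log output."""
    captured = {}
    for line in log_output.splitlines():
        match = DATA_LINE.match(line.strip())
        if match:
            captured[match.group(1)] = match.group(2).strip()
    return captured


def render_step(template: str, variables: dict) -> str:
    """Substitute ${data.<key>} placeholders in a later step's command."""
    def substitute(match):
        # Unknown keys are left untouched so the gap stays visible.
        return variables.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(substitute, template)


# Step one logs a value; step two reuses it.
step_one_log = "starting backup\nRUNDECK:DATA:backup_file = /tmp/db-backup.tar.gz\ndone"
variables = capture_variables(step_one_log)
next_command = render_step("upload ${data.backup_file}", variables)
```

Leaving unknown placeholders unchanged is a design choice in this sketch; a stricter version could raise an error instead.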
Most of what Rundeck does is done via one of its plugins. There are already more than 100 plugins that perform various services, including executing commands on nodes, performing steps in a workflow, and sending notifications about job status. There may be instances where you need to write your own plugin to perform a specific step or action. In this session, we will walk through the steps for writing your own plugin.
In this session you'll learn:
The structure of a plugin
How to use the structure and what information you need to include in other files to make your plugin work
How to write a simple plugin example using Java
How to deploy and use your plugin
Lunch and learn: Getting started with Rundeck & Ansible Rundeck
Operations teams depend on a mixture of tools to keep their systems running. One popular pairing for Rundeck users is integrating Ansible playbooks into Rundeck to orchestrate and schedule workflows across multiple tools.
Join us for this Lunch and Learn event to learn how you can use Rundeck to create runbooks that span your existing Ansible playbooks -- as well as any other scripts, tools, APIs, or systems commands, to respond to incidents or perform Operations tasks.
Join us to learn:
Benefits of using Rundeck and Ansible together
How to configure your Rundeck to use the Ansible plugin
Tips for getting started with the integration
And see a demo of the integration
This event is recommended for beginners.
Self Service Cloud Operations: Safely Delegate the Management of your Cloud ... Rundeck
Running Operations is not an easy job, especially these days. Ops teams have to ensure excellent user experiences, resolve incidents quickly and help developers stay productive. Yet at the same time, there is also the need to maintain systems security and keep downtime to a minimum.
While advances in cloud computing have helped address some of these challenges, many organizations find it difficult to leverage the cloud at scale because of bottlenecks that form around repetitive tasks, such as developers having to wait for provisioning infrastructure. Despite having access to abundant cloud resources, these speedbumps often make it difficult to achieve team objectives.
Join this talk to learn:
How to safely delegate the management of your cloud deployment (to developers and other end users) with self-service operations.
How to create powerful runbooks with guardrails that leverage existing scripting languages, infrastructure, and tools to remove bottlenecks that form around repetitive tasks.
Strategies for getting started with self-service.
Rundeck Office Hours: Best Practices Access Control Policies Rundeck
Join us this month for an AMA discussion followed by a live Q&A led by technical experts from Rundeck’s engineering, product, and solution engineering teams. Experts are available to provide advice on your technical architecture, give recommendations for operational best practices, review current Github issues, or dive into the open source code itself.
Don’t miss the opportunity to learn Rundeck product best practices and ask experts your questions about Rundeck.
https://www.rundeck.com/rundeck-office-hours
Secure IT infrastructure is well protected by access keys, passwords, and other credentials. Admins need these secrets to gain access, as does any automation executed by Rundeck. Rundeck has rich support for secrets management with native key storage, as well as integrations with best-of-breed standardized solutions. In this webinar, we’ll cover best practices for working with Rundeck’s runbook automation platform in securing IT infrastructure. We’ll explore the secrets management options in Rundeck and we’ll highlight a new plugin with Thycotic Secret Server for Privileged Access Management.
In this webinar, we will demonstrate:
How Rundeck works with underlying secrets of the systems it manages
New Rundeck plugins that allow users to protect privileged accounts with enterprise-grade, privileged access management solutions
How you can use Rundeck plugins with HashiCorp Vault, Thycotic, and CyberArk as keys for jobs and other Rundeck configurations
In this session we will give a live walkthrough covering new capabilities released in Rundeck 3.4. Learn about security & compliance improvements we’ve made including the ability to organize secrets management by project -- so now each Runbook can access a different set of passwords and keys for its access control list (ACL). We also have a new plug-in for Thycotic users to manage secrets. Rundeck 3.4 now allows for queueing of jobs when those jobs must be run serially. Finally, we’ll discuss our vision for the future of Rundeck, and our primary development themes for the next year.
Automate Yourself Out of a Job: Safely Delegate the Management of your Azure... Rundeck
Running Operations is not an easy job, especially these days. Ops teams have to ensure excellent user experiences, resolve incidents quickly and help developers stay productive. Yet at the same time, there is also the need to maintain systems security and keep downtime to a minimum - goals which many struggle with at scale.
While advances in cloud computing have helped address some of these challenges, many organizations find it difficult to leverage the cloud at scale because of bottlenecks that form around repetitive tasks, such as developers having to wait for provisioning infrastructure. Despite having access to abundant cloud resources, these speedbumps often make it difficult or impossible to achieve team objectives.
Join this talk to learn:
-How to safely delegate the management of your Azure deployment (to developers and other colleagues) with self-service operations.
-How to create powerful runbooks with guardrails that leverage existing scripting languages (including PowerShell), infrastructure, and tools to remove the human from the bottleneck that forms around repetitive tasks.
-Strategies for getting started
-And how to create an Easy Button to handle the repetitive tasks that are interrupting your flow of work.
As presented by Jesse Houldsworth at PowerShell + DevOps Global Summit 2021
Super-Charge Your Site Reliability Practices with Runbook Automation Rundeck
On Demand Viewing: https://www.rundeck.com/super-charge-reliability
To win in today’s digital age, organizations need to balance product reliability and feature delivery with dynamic business needs and legacy and multi-cloud environments. Automation, as a main SRE practice, scales product reliability practices by reducing tedious tasks related to production operations, freeing up engineers to work on innovation.
Whether you are in a traditional operations organization or a “you build it, you run it” team, this webinar will explore strategies for increasing automation to improve your Operations so you can continue to create excellent experiences for your customers.
-How you can reduce MTTR and eliminate toil with Self-Service Operations
-Common workflow challenges and opportunities
-How you can use Runbook Automation to enable Self-Service Operations
-Ways to leverage existing assets and workflows by integrating Rundeck with existing toolsets
-See a demo of real world cases
https://youtu.be/4jAf6cbxsgo
As operators, it’s our job to monitor infrastructure, systems and applications and only wake up humans for tasks machines can’t fix on their own. Automated remediation pairs monitoring and runbook automation, giving you a monitoring system that can trigger operational actions with runbook automation to shorten incident response times and avoid alert fatigue.
Rundeck Director of Product Management Forrest Evans and Sensu Developer Advocate Todd Campbell discuss the key role automated remediation plays in the monitoring journey, with live demos of both the Rundeck and Sensu integrations. You’ll learn all about monitoring as code workflows with the Sensu Observability Pipeline and how to deliver runbook automation with Rundeck — and see how the two together can help you achieve automated remediation.
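The pairing of monitoring and runbook automation described above can be sketched as a small dispatcher. This is an illustrative Python sketch under stated assumptions, not the Rundeck or Sensu integration: the `Alert` type, the `RUNBOOKS` mapping, `restart_service`, and `handle_alert` are hypothetical names, and the "runbook" here is a plain function standing in for a real automated job.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    check: str  # name of the failing check, e.g. "service_down"
    host: str   # host the alert fired on


def restart_service(alert: Alert) -> str:
    # Stand-in for a real runbook job; a real one would call an automation tool.
    return f"restarted service on {alert.host}"


# Hypothetical mapping from check names to automated runbooks.
RUNBOOKS: Dict[str, Callable[[Alert], str]] = {
    "service_down": restart_service,
}


def handle_alert(alert: Alert,
                 runbooks: Dict[str, Callable[[Alert], str]],
                 page_human: Callable[[Alert], None]) -> str:
    """Try an automated runbook first; page a human only if none applies or it fails."""
    runbook = runbooks.get(alert.check)
    if runbook is None:
        page_human(alert)
        return "escalated"
    try:
        return runbook(alert)
    except Exception:
        page_human(alert)
        return "escalated"


paged: List[Alert] = []
fixed = handle_alert(Alert("service_down", "web-01"), RUNBOOKS, paged.append)
unknown = handle_alert(Alert("disk_corrupt", "db-01"), RUNBOOKS, paged.append)
```

In this sketch only the alert with no matching runbook reaches a human, which is the behavior the abstract describes as waking people up only for tasks machines cannot fix on their own.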
Failure is inevitable. But are you incurring more downtime and disruption than necessary? Legacy incident response techniques have difficulty keeping up with the increasing pace of change and skyrocketing complexity of today’s application environments.
During this webinar, you’ll learn about modern incident response techniques that can dramatically shorten incidents and reduce escalations. Join the experts from Rundeck and PagerDuty as they share:
*How a real-time operations platform intelligently manages alerts and on-call mobilization, delivering the right people the right information at the right time
*How runbook automation gives front-line response teams self-service access to run automated workflows – or runbooks – that diagnose and resolve incidents without escalating to an expert.
*How to automatically detect, diagnose, and resolve incidents without human intervention.
https://youtu.be/9yYwTPMRSOY
Nathan Fluegel, head of Customer Success at Rundeck, talks clustering and high availability. We'll show how to deploy Rundeck servers in a clustered configuration with Rundeck Enterprise.
https://youtu.be/PmBIGP3M9sI
Understand how to migrate your Rundeck environment from the community edition to Enterprise, including the pros and cons of each migratory approach.
In this webinar, you will learn how to:
-Determine which migration approach is most appropriate for your environment
-Shift from a single-server to clustered environment
-Migrate jobs and projects while keeping a clean install
Business Continuity for Humans: Keeping Your Business Running When Your Peopl... Rundeck
Damon Edwards (Rundeck) presentation from TechStrongConf on June 4, 2020.
Learn more: https://www.rundeck.com/business-continuity-for-digital-operations
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Epistemic Interaction - tuning interfaces to provide information for AI support Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Essentials of Automations: Optimizing FME Workflows with Parameters Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Key Trends Shaping the Future of Infrastructure.pdf Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
UiPath Test Automation using UiPath Test Suite series, part 3 DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 Tobias Schneck
As AI technology pushes into IT, I was wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it working from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I already have working for real.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
6. What Is an Incident?
An unplanned disruption impacting
customers or business operations
7. What Is an Incident?
An unplanned disruption impacting
customers or business operations
Outages
Service Degradation
8. What Is an Incident?
An unplanned disruption impacting
customers or business operations
Outages
Service Degradation
Work interruption
Delay/Waiting
“Short-Notice” Requests
24. Adrian Cockcroft, DockerCon EU 2014
[Diagram labels: Developer (x5), Release Plan (x4), Deploy Feature to Production (x5), Old Release Still Running, Bugs]
Immutable microservice deployment scales, is faster with large teams and diverse platform components
Architecture enables speed.
Speed is the advantage.
37. Principles of SRE
1. SRE needs Service Level Objectives, with consequences
2. SREs have time to make tomorrow better than today
3. SRE teams have the ability to regulate their workload
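The first principle, Service Level Objectives with consequences, is often made concrete through an error budget. The slide does not spell out a formula, so the following is a minimal Python sketch under stated assumptions: the SLO is expressed as a success ratio over requests, and the "consequence" is modeled as a release freeze once the budget is spent. The function name `error_budget_status` and the example figures are hypothetical.

```python
def error_budget_status(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compare observed failures with the failures an SLO allows.

    slo_target is a success ratio, e.g. 0.999 for "99.9% of requests succeed".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1")
    # The error budget is the share of requests that are allowed to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    remaining = allowed_failures - failed_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining": remaining,
        # Assumed consequence for this sketch: stop shipping features when the budget is gone.
        "freeze_releases": remaining < 0,
    }


# Hypothetical month of traffic: one million requests against a 99.9% SLO.
healthy = error_budget_status(0.999, 1_000_000, 400)
exhausted = error_budget_status(0.999, 1_000_000, 1_200)
```

With these example numbers the budget is roughly 1,000 failed requests, so 400 failures leaves budget to spare while 1,200 overspends it and triggers the freeze.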
74. Why?
Why?
Why?
Why?
Why?
There is no root cause.
(That’s just a political distinction)
Right, Wrong, Safety II, and You.
Incidents = unplanned investments
REDeploy.io