All Day DevOps: Calling Out A Terrible On-Call System - Molly Struve
Back when our team was small, all the devs participated in a single on-call rotation. As our team started to grow, that single rotation became problematic. Eventually, the team was so big that people were going on-call every 2-3 months. This may seem like a dream come true, but in reality, it was far from it. Because shifts were so infrequent, devs did not get the on-call experience they needed to know how to handle on-call issues confidently. Morale began to suffer and on-call became something everyone dreaded.
We knew the system had to change if we wanted to continue growing and not lose our developer talent, but the question was how? Despite all of the developers working across a single application with no clearly defined lines of ownership, we devised a plan that broke our single rotation into 3 separate rotations. This allowed teams to take on-call ownership over smaller pieces of the application while still working across all of it. These individual rotations paid off in many different ways.
With a new sense of on-call ownership, the dev teams began improving alerting and monitoring for their respective systems. The improved alerting led to faster incident response because each team was focused on a smaller, better-monitored piece of the system. In addition, having 3 devs on-call at once meant no one ever felt alone, because there were always 2 other people on-call with you. Finally, cross-team communication and awareness also drastically improved with the new system.
Deja vu Security CEO Adam Cecchetti was invited to present the keynote speech at this year's (sold-out!) Hushcon in Seattle. Rich in humorous anecdotes and practical analysis, Test For Echo explores the relationship between time, ken, and the future of computer security.
'10 Great but now Overlooked Tools' by Graham Thomas - TEST Huddle
The idea for this presentation came directly from EuroSTAR 2011. On the bus back to the conference centre after the Gala Dinner, a discussion started about industry luminaries who turn up at conferences and give presentations which roughly say, "Don't do all the stuff that I told you to do 5 years ago! Do this stuff now." But, but, but . . . .
As we got talking I realised how many simple, effective tools I no longer used, either because they had been overlooked and forgotten, falling into disuse, or because modern methods claim not to need them and so deem them redundant. I wondered if any of them were worth looking at again. Starting with my trusty flowcharting template, I realised it is a great tool which I have overlooked for far too long!
Here is my list of 10 great but now overlooked tools:
• Flowcharts
• Prototypes
• Project Plans
• Mind Maps
• Tools we already have at our disposal like ....
• Aptitude Tests
• Hexadecimal Calculators
• Desk Checking
• Data Dictionaries and Workbenches
This is my list of really useful tools that I think are overlooked. In the webinar I will outline each tool, why I think it was great, and what we are missing out on by not using it.
And it naturally follows that if there are some tools we have overlooked, then there are also some tools that we should get rid of! I will identify some.
Hopefully this webinar will give you a different perspective on the tools you use for testing, point out some tools that may be improved upon or plainly discarded, and help you think about the tools you currently use and perhaps view them in a different light.
Future of software development - Danger of Oversimplification - Jon Ruby
A talk given at the Servoy World conference (https://servoy.com/servoyworld2017/) on some perspectives for the future of the software development industry.
Graham Thomas - Software Testing Secrets We Dare Not Tell - EuroSTAR 2013 - TEST Huddle
EuroSTAR Software Testing Conference 2013 presentation on Software Testing Secrets We Dare Not Tell by Graham Thomas.
See more at: http://conference.eurostarsoftwaretesting.com/past-presentations/
A new hope for 2023? What developers must learn next - Steve Poole
Over the last ten years, we’ve seen cybercrime accelerate beyond all comprehension, and its impact on our society and economies has grown relentlessly. It’s taken a long time for the world to act, but finally, we’re coming together to resist this uniquely 21st-century evil.
At the heart of the resistance are developers. Whatever role you have, whatever programming language or software you use - the battle is at your door.
In this session, we’ll brief you on the state of the situation and what you can do to be more prepared: we’ll look at the bad guys and how they operate, examine recent legal and government responses and, most importantly, see how the software industry is working together to create the tools, frameworks and education needed to help us all become the developers we need to be.
Feedback loops between tooling and culture - Chris Winters
A discussion of how the tools technologists create impact culture, and how culture impacts those tools. Not really a standalone presentation, but hopefully useful.
Deja vu Security - Adam Cecchetti - Security is a Snapshot in Time - BSidesPDX ... - adamdeja
As the air gap between our daily lives and the Internet continues to shrink, the security of our personal data and devices grows in importance. We are facing the daily threat of putting 2000s-era computers bolted to toasters online while expecting them to defend against 2017-capable attackers. This talk will explore the continuing trend of IoT, discuss how we’ve been here before, and lay out strategies for keeping pace with attackers in the future, focusing on enumerating this risk, the challenges involved, and possible solutions.
First, we will examine the history of how we got here, and what it means to say “security is a snapshot in time.” We then introduce the idea of shared ken – the range of one’s knowledge or sight – and how it impacts security. Third, we discuss the influence of data as code, the meta game, and secrecy as a way of mastering impact and ken.
This talk will allow attendees to walk away with:
• A holistic view of the history of computer security and how it impacts them today
• The importance of extending the range of collective vision to reduce blind spots
• Practical advice for BSiders to grow their mindset and improve their impact
Adam is a founding partner and Chief Executive Officer at Deja vu Security. He is dedicated to leadership and relentless innovation in Deja’s products and services. Previously he has led teams conducting application and hardware penetration tests for Fortune 500 technology firms. Adam is a contributing author to multiple security books, benchmarks, tools, and DARPA research projects. Adam holds a degree in Computer Science and a Master’s from Carnegie Mellon University in Information Networking.
Whether you are a big, sprawling MNC or a sleek, sexy start-up, zombie software will quickly invade your product platform. This deck is meant to start a conversation on how our industry can fight the zombies.
Thierry de Pauw - Feature Branching considered Evil - Codemotion Milan 2018 - Codemotion
With DVCSs branch creation became very easy, but it comes at a certain cost. Long living branches break the flow of the software delivery process, impacting stability and throughput. The session explores why teams are using feature branches, what problems are introduced by using them and what techniques exist to avoid them altogether. It explores exactly what's evil about feature branches, which is not necessarily the problems they introduce - but rather, the real reasons why teams are using them. After the session, you'll understand a different branching strategy and how it relates to CI/CD.
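One technique commonly offered as the alternative to long-lived feature branches is keeping incomplete work on the mainline behind a feature flag. The abstract does not name the speaker's preferred strategy, so this is only a minimal illustrative sketch in plain Ruby; the `FeatureFlags` class, the `:new_pricing` flag, and `CheckoutService` are all hypothetical names.

```ruby
# Minimal feature-flag sketch: both code paths live on the mainline,
# so half-finished work can be merged continuously instead of sitting
# on a long-lived branch. All names here are hypothetical.
class FeatureFlags
  def initialize(enabled = {})
    @enabled = enabled
  end

  def enabled?(name)
    @enabled.fetch(name, false)
  end
end

class CheckoutService
  def initialize(flags)
    @flags = flags
  end

  # The flag decides at runtime which pricing path runs; flipping it
  # requires no merge and no redeploy of a divergent branch.
  def total(items)
    if @flags.enabled?(:new_pricing)
      items.sum { |i| (i[:price] * (1 - i.fetch(:discount, 0))).round(2) }
    else
      items.sum { |i| i[:price] }
    end
  end
end
```

The trade-off is that dead flags must be cleaned up once a path wins, but the integration pain of merging a months-old branch disappears.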
Agile is a 4 letter word - dev nexus 2020 - Jen Krieger
Based on a wide variety of surveys taken over recent years, many companies are transitioning to something that looks like Agile, whether they use that term or not. However, that transition doesn’t necessarily mean implementations have been done while respecting the Agile Manifesto and the principles behind it.
When going into the development of a software product, a common source of mistakes is underestimating the complexity that lies behind an idea, as well as the clutter created by the massive number of available technologies. This presentation explains a possible way to deal with such issues.
Engineers tend to start most of the technology startups. While this gives them an inherent advantage as far as engineering the product goes, it also tends to put them at a disadvantage when it comes to designing (non-technically) and commercializing the product.
This slide deck takes up the key concepts from product management (PdM) that apply to startup-mode products. This is not a case for having Product Managers on board; 80% of startups don’t need a dedicated PM.
Towards the end, it introduces the funky concept of Product Entropy.
Putting Devs On-Call: How to Empower Your Team - VictorOps
A main tenet of DevOps is bridging the gap between the Dev team and the Ops team. One way to accomplish this is to include devs in the on-call rotation. While this may sound difficult, it’s not impossible to do…as our guide demonstrates.
We profile four companies that have successfully transitioned their dev team to being on-call and their stories can provide examples for how you too can do it.
OSDC 2019 | Feature Branching considered Evil by Thierry de Pauw - NETWAYS
With DVCSs, branch creation became very easy, but it comes at a certain cost. Long living branches break the flow of the software delivery process, impacting stability and throughput. The session explores why teams are using feature branches, what problems are introduced by using them and what techniques exist to avoid them altogether. It explores exactly what’s evil about feature branches, which is not necessarily the problems they introduce – but rather, the real reasons why teams are using them. After the session, you’ll understand a different branching strategy and how it relates to CI/CD.
Slides from my DevOpsExpo London talk "From oops to NoOps".
They tell you in these conferences that DevOps is not about tools, but about culture. And they are partially right. I am going to tell you that it’s not only about culture or tools but also abstractions.
It is a lot about how you see software and its value. About our mental model of what software is: how it runs, evolves, and interacts with the other facets of an enterprise.
We used to view software as code. As a state of code. Now we think about software as change, as a flow. A dynamic system where people, machines, and processes interact continuously.
At Platform.sh we spend a bunch of time asking ourselves not “How do you build?” - or even “How do you build consistently?” - but rather “What does it mean to consistently build in a world where change is good?” A world that lets you push security fixes into production as soon as they’re available because you don’t want to be an Equifax but you do want stability.
In this presentation, I will go over what we think software is and why having the right ideas about software will help you get your culture right and your tooling aligned, as well as gain in productivity, and general happiness and well-being.
Similar to LeadDev NYC 2022: Calling Out a Terrible On-call System (20)
Elasticsearch 5 and Bust (RubyConf 2019) - Molly Struve
Breaking stuff is part of being a developer, but that never makes it any easier when it happens to you. The Elasticsearch outage of 2017 was the biggest outage our company has ever experienced. We drifted between full-blown downtime and degraded service for almost a week. However, it taught us a lot about how we can better prepare and handle upgrades in the future. It also bonded our team together and highlighted the important role teamwork and leadership plays in high-stress situations. The lessons learned are ones that we will not soon forget. In this talk, I will share those lessons and our story in hopes that others can learn from our experiences and be better prepared when they execute their next big upgrade.
Creating a Scalable Monitoring System That Everyone Will Love - ADDO - Molly Struve
A year ago, my company's monitoring setup was a disaster! We had 6 different monitoring tools sending alerts all over the place. In this talk, I will share how we overhauled our entire monitoring system and created a single, centralized, easy to use system that fits all of our needs. Not only does it fit our needs, but because it is so simple to use, developers have bought into the system and are actively helping to improve it as well.
Creating a Scalable Monitoring System That Everyone Will Love (Velocity Conf) - Molly Struve
A year ago, my company's monitoring setup was a disaster! We had 6 different monitoring tools sending alerts all over the place. In this talk, I will share how we overhauled our entire monitoring system and created a single, centralized, easy to use system that fits all of our needs. Not only does it fit our needs, but because it is so simple to use, developers have bought into the system and are actively helping to improve it as well.
Cache is King: Get the Most Bang for Your Buck From RubyMolly Struve
Sometimes your fastest queries can cause the most problems. I will take you beyond the slow query optimization and instead zero in on the performance impacts surrounding the quantity of your datastore hits. Using real world examples dealing with datastores such as Elasticsearch, MySQL, and Redis, I will demonstrate how many fast queries can wreak just as much havoc as a few big slow ones. With each example I will make use of the simple tools available in Ruby to decrease and eliminate the need for these fast and seemingly innocuous datastore hits.
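The abstract does not spell out which "simple tools available in Ruby" it means, but one standard pattern it alludes to is replacing many fast per-record datastore hits with a single bulk fetch plus a local hash cache. Here is a minimal sketch of that idea in plain Ruby; `ClientCache` and the `bulk_fetch` interface are hypothetical names, standing in for whatever Elasticsearch, MySQL, or Redis client you actually use.

```ruby
# Local caching sketch: N lookups cost at most 1 datastore round trip
# for the ids not yet seen, instead of N individual queries.
# ClientCache and bulk_fetch are hypothetical names.
class ClientCache
  def initialize(datastore)
    @datastore = datastore # must respond to #bulk_fetch(ids) => { id => record }
    @cache = {}
  end

  def fetch_all(ids)
    missing = ids.reject { |id| @cache.key?(id) }
    @cache.merge!(@datastore.bulk_fetch(missing)) unless missing.empty?
    ids.map { |id| @cache[id] }
  end
end
```

Each individual query here is already fast; the win comes purely from cutting the quantity of round trips, which is exactly the failure mode the talk describes.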
Everyone wants their Elasticsearch cluster to index and search faster, but optimizing both and finding the balance between the two can be tricky. At Kenna Security, we use Elasticsearch to store over 3 billion vulnerabilities for our clients. All that data needs to be quickly accessible so clients can assess their cyber security risk. At the same time the data is constantly changing. On average, we update 200+ million documents a day which means indexing speed is also a top priority.
In the early days our cluster could barely keep up. Nodes would fall over constantly, indexing queues would get backed up for days, and searches timed out about 50% of the time. Fixing all of these issues did not happen overnight. However, with a lot of testing, tweaking, and a few “OH crap!” moments we were able to build a stable, 21 node cluster that now meets all of our indexing and searching demands. In this talk I will share the insights we gained and the strategies we used to scale our cluster and hopefully that advice will save others some time and frustration as they grow their own.
Final project report on grocery store management system.pdf - Kamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner, and customers increasingly wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing the various products available, enables registered users to purchase desired products instantly using the Paytm and UPI payment processors (Instant Pay), and also lets them place orders using the Cash on Delivery (Pay Later) option. The project provides easy access for Administrators and Managers to view orders placed using the Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of technologies must be studied and understood. These include multi-tiered architecture, server- and client-side scripting techniques, implementation technologies, programming languages (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. The objective of this project is to develop a basic website where a consumer is provided with a shopping cart, and to learn about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Cosmetic shop management system project report.pdf - Kamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it is tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. The system includes various function programs to carry out the above-mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system deals with the automation of the general workflow and administration process of the shop. The main processes of the system focus on customer requests, where the system is able to search for the most appropriate products and deliver them to the customers. It helps the employees quickly identify the cosmetic products that have reached the minimum quantity, keeps track of the expiry date for each cosmetic product, and helps the employees find the rack number in which a product is placed. It is also a faster and more efficient way of working.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
LeadDev NYC 2022: Calling Out a Terrible On-call System
1. Calling Out a Terrible
On-Call System
April 6, 2022
Hi! My name is Molly Struve and I want to welcome you to Calling Out a Terrible On-Call System! Currently I am a Site Reliability Engineer at Netflix, but the story I want to share with you today is from my time at a previous startup. Being a…
2. Site Reliability
Engineer
@molly_struve
Site Reliability Engineer means I am one of those weird people who thrives on being on-call. The adrenaline rush of having to figure out a bug as quickly as possible really gets me going. But I’m pretty positive the vast majority of engineers are not like me. Raise your hand if you…
3. @molly_struve
hate being on-call or in the past have had a horrible on-call experience? Ah yes, MANY of you! On-call is a necessity to support the applications we build but…
4. @molly_struve
it SHOULD NOT, I repeat, it should NOT make people miserable. If your engineers are miserable during on-call, then you have a problem. I am here today to give you some suggestions and strategies you can use to help you fix this common problem. All of these strategies…
5. @molly_struve
I am about to share unfortunately didn’t just hit me while I was sleeping one night. To figure all of this out I had to live through one of those terrible on-call systems, and that experience showed me firsthand the toll a broken system can take on everyone involved. Here is the story of a terrible on-call system in the making!
8. @molly_struve
👩‍💻 👩‍💻 👨‍💻 👩‍💻 👨‍💻 — 1 week shifts
one week at a time. When we first started the rotation, the team had 5 developers on it and it worked great! Everyone was very experienced with the application and with being on-call because everyone did it relatively often. However, as the years went by…
13. @molly_struve
😱 ☹️ 😭 😣 😡 😞 😖 😬 — Growing, complex codebase
the codebase had grown tremendously and was vastly more complex than when we had started. There were so many things being developed at once that when a problem arose, there was a solid chance the on-call developer knew nothing about it or the code that was causing it. And what happens when an alarm goes off and you have no idea what to do…
14. @molly_struve
You panic! And who can blame you? We have all been there. When you have to fix something you know nothing about, it’s terrifying! When the developers would panic, they would turn to the people they knew could likely fix it the fastest, and that was…
15. @molly_struve
Site Reliability
Engineering Team
The Site Reliability Engineering team. Of course the devs were right in their assumption, usually the Site Reliability team could fix the problem the fastest, but the Site Reliability team only had 3 people on it and
relying on a small set of people for everything doesn’t scale. Constantly having to jump in and help…
16. @molly_struve
Site Reliability
Engineering Team
with on-call issues quickly began to drain a lot of the team's time and resources. Essentially, the SRE team began to act as if they were on-call 24/7. The constant bombardment of questions and requests…
19. @molly_struve
😱 ☹️ 😭 😣 😡 😞 😖 😬 — No Ownership
no ownership over the code they were responsible for while on-call. One person would write the code and another person would be the one debugging it if it broke. The app was so big that there was no way anyone could have a sense of ownership over the production code, since there was just too much of it. This…
27. @molly_struve
Team
Organization
How the engineering team was organized at the time so you have some context about how we ended up with the solution we did. In the engineering department at the time there were…
30. @molly_struve
One Monolithic
Application
👨‍💻 👩‍💻 ×15 (three teams of five) with managers 👨‍💼 👩‍💼 👨‍💼
single monolithic application. Unlike other apps that might have very separate backend components owned by individual teams, there were no clear or obvious lines of ownership within…
31. @molly_struve
One Monolithic Application
this single monolithic application. This would prove to be the biggest hurdle when it came to fixing this terrible on-call system. Now that you have a little background on the team organization, let’s get to the good stuff…
36. @molly_struve
More Frequent Shifts
to more frequent on-call shifts, which meant more practice and experience for those handling on-call. As backward as it may sound, being on-call on a regular cadence is a benefit because developers become…
37. @molly_struve
More Frequent Shifts → More Comfortable
a lot more comfortable with it and are able to really figure out a strategy that works best for them. So the first strategy we implemented…
39. Overhauling On-Call
1. Smaller On-Call Rotations
to split our giant rotation into 3 smaller on-call rotations. Those 3 smaller rotations solved the problem of shift frequency, but that still left the biggest problem of all…
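The shift-frequency arithmetic behind this split can be sketched in a few lines. The 15-developer and three-teams-of-five numbers come from the talk itself; the helper function is purely illustrative:

```python
# In a weekly round-robin rotation, a developer's gap between on-call
# shifts equals the number of people in the rotation.
def weeks_between_shifts(rotation_size: int) -> int:
    return rotation_size

# One giant rotation of 15 devs: each dev is on-call roughly every
# 15 weeks -- too infrequently to build any on-call muscle memory.
big_rotation_gap = weeks_between_shifts(15)
assert big_rotation_gap == 15

# Three rotations of 5 devs each: every dev is on-call every 5 weeks,
# with 3 devs (one per rotation) on-call at any given time.
small_rotation_gap = weeks_between_shifts(5)
assert small_rotation_gap == 5
```

The trade-off is explicit in the numbers: each developer is on-call three times as often, but over a third of the surface area, and with two peers on-call alongside them.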
42. @molly_struve
Application
Ownership
Team 1 Team 2
Team 3
to split up the on-call application ownership amongst the 3 developer teams. Even though I am about to breeze through this split, I want to be clear: this did not happen overnight. During this process there were a lot of meetings, planning, and collaborating…
43. @molly_struve
Team 1 Team 2
Team 3
Site
Reliability
Between the Site Reliability team and the developer teams to figure out the best and most logical way to split up the components of our monolithic application. I really want to highlight that this was not the Site
Reliability team calling the shots and handing over the “assignments” to the…
44. @molly_struve
Team 1 Team 2
Team 3
Site
Reliability
developer teams. We wanted this whole process to be as collaborative as possible because we knew that was going to give us the highest chance of succeeding. These application components may be specific for this
45. @molly_struve
Splitting App Ownership
Team 1 Team 2 Team 3
situation but I want to call them out in hopes that it might spark some ideas for how you could split up application ownership amongst multiple teams when clear lines might be hard to define. We first started by splitting up
our…
46. @molly_struve
Background Workers
background workers. Our app did a lot of async processing and had a lot of background workers, so we figured those would be good to divide up. Team 1…
47. @molly_struve
got the Data Processing Workers. Team 2…
48. @molly_struve
got the Overnight Reporting Workers, and finally Team 3…
49. @molly_struve
got the User Communication Workers. The next thing we needed to split up were our…
50. @molly_struve
Service Alerts
service alerts. When I say service alerts here, I am referring to alerts that were set up within our existing monitoring system to watch things like our databases and systems. Before, it was a single person staying on top of all of them. With this new system we decided to split them up as well. We gave…
51. @molly_struve
Team 1 the Redis and Worker Queue alerts. We gave…
52. @molly_struve
Team 2 the Elasticsearch and API alerts. And finally we gave…
53. @molly_struve
Team 3 the MySQL and Page Load alerts. Now that the existing service alerts and our background workers were split up, the last thing to split up was the…
54. @molly_struve
Application Code
application components and code. We were running a single monolithic Rails application, so this involved splitting up things like models and controllers within the codebase. We started by giving…
55. @molly_struve
all the Data Processing code to Team 1. We figured this would pair well with the background workers they were also assigned. We gave…
56. @molly_struve
Team 2 the Reporting and Emailing code, which paired well with their overnight workers. And finally we gave…
57. @molly_struve
Team 3 the User and in-app Alert code, which paired well with their user communication workers. Once…
58. @molly_struve
Splitting App Ownership
Team 1: Data Processing Workers | Redis and Worker Queue Alerts | Data Processing Code
Team 2: Overnight Reporting Workers | Elasticsearch and API Alerts | Reporting and Emailing Code
Team 3: User Communication Workers | MySQL and Page Load Alerts | User and App Alert Code
Once the lines had been drawn, we stressed to each of the developer teams that despite doing our best to balance the code equally, we might still have to move things around. This showed the developers that we were fully invested in making sure this new on-call rotation was fair and better for everyone. As I mentioned earlier, I wanted…
59. @molly_struve
to get a little specific here with how we split up our application, so that hopefully it can give you some ideas about how you might go about splitting up ownership in a single application where lines might not be clearly drawn. And with that…
60. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
Splitting up the application ownership slides into spot 2 in our overhauling on-call list. Now, when it comes to instilling a feeling of ownership, another big obstacle is constantly…
61. @molly_struve
Changing Code
changing code. Having 15 developers meant we could turn out a lot of features, but then the question became: how did teams stay on top of the code they were responsible for when on-call as it changed? For this we…
68. @molly_struve
CODEOWNERS
/*.md @org/team-1
/app/controllers/reporting/ @org/team-2
/app/workers/data_processing/ @org/team-1
/config/database.yml @org/team-3
With this file in place, when any file in your app directory is updated in a pull request, the owners of the file are automatically tagged for review. This allowed the 3 teams to work across the entire codebase while also staying on top of what was changing in the components they were responsible for during on-call. Using…
69. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
A CODEOWNERS file slips into the 3rd spot in our overhauling on-call strategy list. With the application components split up and a CODEOWNERS file to support and empower that ownership feeling, next…
70. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
on our list was to make sure every team, and every single person on each team, was completely comfortable with the application components they had been given ownership over. To do this the SRE team…
75. @molly_struve
On-Call Training Sessions
• Common issues
common issues that might pop up. For example: when this alert goes off, it usually means xyz is broken in this piece of the code. We also took the time to…
76. @molly_struve
On-Call Training Sessions
• Common issues
• Code Functionality
dive into all of the code functionality. We made sure every person on every team knew exactly what each piece of code they covered did. And last but not least, we made sure each team understood…
77. @molly_struve
On-Call Training Sessions
• Common issues
• Code Functionality
• Larger Application Impact
how their components impacted the rest of the application. For example, if Redis went down, how did that affect the rest of the application? These on-call training sessions gave devs…
78. @molly_struve
Confidence
a lot more confidence in their ability to handle on-call situations because they now had a clear picture of what they were responsible for and how to handle it. Even though they hadn’t built some of the code themselves,
they had an understanding of exactly how it all worked. Hosting…
79. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
4. On-Call Training Sessions
On-call training sessions take the 4th spot in our overhauling on-call list. As I mentioned earlier, the purpose of these training sessions was not only to educate the developers about the code they were supporting, but also to give them confidence. Another confidence booster for developers who were on-call was…
80. @molly_struve
On-Call Support
Having on-call support. What exactly do I mean by this? When a person is paged they aren’t always going to have all of the answers. Sometimes they need help and support from someone else to figure out the
problem. Originally…
81. @molly_struve
On-Call Support
Site Reliability
the Site Reliability team acted as support for the on-call developer. If the on-call developer had questions or needed help they would talk to the Site Reliability team member that was on-call that week. The problem
with this approach was that our Site Reliability team, as I mentioned earlier, only…
83. @molly_struve
On-Call Support
😥 😕 😫
It’s not going to end well! Our Site Reliability team got burned out pretty quickly being the constant support system for the on-call developers. With…
86. @molly_struve
👨‍💻 👩‍💻 ×3 teams, each with a manager 👨‍💼
acts as support for the others. If any developer finds themselves overwhelmed or stuck on an issue, they have two people they can reach out to for help. Having a support system like this is crucial for crafting an on-call system that is comfortable for everyone. No one wants to feel alone when they are on-call, so ensuring that the…
87. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
4. On-Call Training Sessions
5. On-Call Support System
on-call developer has a solid support system in place is crucial. The last improvement we made to our system that was welcomed by everyone was that we…
90. @molly_struve
😦 Technically debugging and fixing the problem
Technically debugging and fixing the problem, which we know is a pretty big ask in itself. In addition, they were responsible…
91. @molly_struve
😦 Technically debugging and fixing the problem | Setting a status page if needed
for setting a status page if needed. And last, it was their job to handle…
92. @molly_struve
😦 Technically debugging and fixing the problem | Setting a status page if needed | Communicating the problem to the rest of the team
communicating the problem to the rest of the team. Needless to say, the duties of the on-call developer were WAY overloaded. With the new system…
93. @molly_struve
😦 Technically debugging and fixing the problem
the ONLY responsibility an on-call developer had was debugging and fixing the problem. Narrowing the scope was crucial to…
94. @molly_struve
😊 Technically debugging and fixing the problem
improving the on-call experience. It allowed the developers to focus on what they did best: fixing the technical problem at hand. The responsibility of setting the status page was moved…
95. @molly_struve
😊 Setting a status page if we need it → support team
to the support team. This made sense to us because the support team is the closest to the customer and, therefore, best equipped to communicate any problems. When an incident occurred, the support team was notified and was responsible for determining if a status page or any customer communication was needed. The responsibility of…
96. @molly_struve
😊 Communicating the problem to the rest of the team → on-call developer’s manager
communicating the problem internally was then moved to the manager of the on-call developer’s team. If updates needed to be spread across the tech organization during an incident, the on-call developer’s manager was responsible for doing it. Narrowing the scope of the on-call responsibilities was…
97. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
4. On-Call Training Sessions
5. On-Call Support System
6. Narrow On-Call Responsibility Scope
the last piece of the puzzle when it came to overhauling this terrible on-call system. I am sure many of you are thinking, “That sounds great, but what does an on-call system like that get me? How can my team benefit from implementing some of these strategies?!” I have touched lightly on some of the benefits, but now I want to really dive into them and talk about…
99. The Payoff
1. Improved Alerting
Improved alerting. Originally the Site Reliability team had set up all the alerting tools. However, once we split up the alerts and handed them over to each of the 3 developer teams, the teams took them and ran. Because each team felt a renewed sense of ownership over their alerts, they started to improve and build on them. Not only did they make more alerts,…
100. The Payoff
but they improved the accuracy of the existing ones. The improved alerts in turn led to happier on-call developers because there were fewer false positives, and alerts were tweaked to fire on problems sooner, before they became a bigger issue. Improved alerting wasn’t the only payoff we saw after overhauling the system. As I briefly mentioned, there was…
101. @molly_struve
Sense of
Ownership
A renewed sense of ownership among all of the developers. Even though one team would edit the code that another team supported, there was still a keen sense of ownership for the supporting team. The supporting
team acted as the domain experts over the code they owned when on-call. The key strategy for ensuring this sense of ownership was…
102. @molly_struve
CODEOWNERS
using the CODEOWNERS file. The CODEOWNERS file ensured that the supporting team was always aware and could sign off on any changes made to the code they supported. In addition, splitting up the code
between the 3 teams meant each team had…
103. @molly_struve
Manageable
Code Chunks
a manageable chunk of code that they could actually learn and support, unlike before, when every developer had to support the entire codebase, which was way too much for any single person to handle. Shrinking…
105. The Payoff
1. Improved Alerting
2. Sense of Ownership
sense of ownership again, and that sense of ownership made them excited to support their on-call code. Another benefit of the new system was…
106. The Payoff
1. Improved Alerting
2. Sense of Ownership
3. Faster Incident Response
Faster incident response time. Hallelujah! Incident response time improved for a couple of reasons. For one, with 3 developers on-call at once and each of them focusing on a smaller piece of the application, they could…
108. @molly_struve
Identify Problems Faster
identify problems faster and catch major issues earlier. This decreased incident response times and even helped prevent some incidents altogether. In addition to identifying problems sooner, debugging and figuring out what had triggered a problem…
109. @molly_struve
Identify Problems Faster
became quicker because teams were intimately familiar with their alerts and the pieces of code they owned. When a problem arose, the team could debug it much more efficiently than before. Faster…
110. The Payoff
incident response is always the goal of any Site Reliability team, and to be able to achieve it with a new on-call system was pretty awesome. Another payoff of this new system was that the person who was on-call was…
111. The Payoff
1. Improved Alerting
2. Sense of Ownership
3. Faster Incident Response
4. Never Alone
Never alone. Having 3 developers on-call at once means that none of the developers are ever alone when they are on-call. If things started to fall apart in one section of the application, the developer that owned that section knew there were two others available to…
112. The Payoff
help if they needed it. Being on-call can be stressful, but knowing that there is always someone easily accessible to help can do wonders for a developer’s confidence. Ensuring that no one is ever alone may seem like a small positive, but I want to add…
113. The Payoff
this was the most requested attribute of an on-call system from the developers. Before starting this overhaul process, I spoke with a few developers to get a feel for what they wanted out of the new system, and at…
114. The Payoff
the very top of the list was having help and support when on-call. Don’t underestimate how much a multiple-developer on-call system can improve the on-call experience. Developers…
115. The Payoff
Never being alone while on-call takes the 4th spot in our list of benefits of overhauling our on-call system. The last benefit we discovered with the new system was…
116. The Payoff
1. Improved Alerting
2. Sense of Ownership
3. Faster Incident Response
4. Never Alone
5. Better Cross-Team Communication
better cross-team communication. As I stated before, each of the 3 developer teams worked across the entire application. This meant teams were often changing the code that another team was responsible for during on-call. Having the CODEOWNERS file ensured that the on-call team was alerted to those changes. This…
117. The Payoff
not only allowed for a good technical review, but it also kept each of the teams up to date on what the other teams were working on. And with that…
118. The Payoff
the list of payoffs of overhauling this terrible on-call system is complete. At the top…
119. The Payoff
Improved alerting. Any small Site Reliability team knows that any outside help you can get with your alerting and monitoring systems is hugely appreciated and benefits everyone. Developers got a renewed…
120. The Payoff
sense of ownership, making them more enthusiastic about their on-call responsibilities…
121. The Payoff
Incident response got faster thanks to the improved alerting and each team’s in-depth knowledge of their on-call components. On-call developers were…
122. The Payoff
never alone when they were on-call, giving them peace of mind and confidence. And finally…
123. The Payoff
cross-team communication improved, which benefited the entire technical organization. I think we can agree that these…
124. The Payoff
benefits would help all of our teams, and to get them all from an on-call system is huge. If these are the benefits your team is looking for and developers are struggling within your on-call system, then consider overhauling it…
125. Overhauling On-Call
1. Smaller On-Call Rotations
2. Split Up Application Ownership
3. Use a CODEOWNERS File
4. On-Call Training Sessions
5. On-Call Support System
6. Narrow On-Call Responsibility Scope
with the help of these 6 strategies. Smaller rotations to increase on-call frequency. Split up application ownership so developers can once again feel like they own what they are supporting. Use…
126. Overhauling On-Call
a CODEOWNERS file to further instill that sense of ownership for your developers. Host on-call training sessions for your teams and ensure those who are on-call always have a support system. Finally, keep the responsibilities for your on-call developers in check so they can focus on what they do best: fixing technical problems. On-call…
127. @molly_struve
On-Call Shouldn’t Suck
is something that many people in this industry dread, and it shouldn’t be that way. If people are dreading on-call, then something is broken in your system. Sure, everyone at some point will get that late-night or weekend page that is a pain, but that pain…
128. @molly_struve
On-Call Shouldn’t Suck
shouldn’t be the norm. If on-call makes people want to pull their hair out ALL the time, then you have a problem that needs to be fixed. I hope this talk has given you some ideas to help you improve your own on-call system so that it can help your developers thrive.