Anything that can go wrong will go wrong! That's how Murphy's law puts it: outages are inevitable, and systems often misbehave.
As developers, we work hard to build reliable and scalable systems; it's our job to keep the ship afloat and the services up. In this session, we will talk about fires, how to put them out, and how to be ready for them. We will cover abuser stories, graceful degradation of service, and dependency management as techniques to fight fires, illustrated through war stories of how they helped, or could have helped, service owners.
Agenda:
- What brought me here?
- What is a fire?
- Firefighting vs. the dev team
- How do you end up with a fire?
- Your dependencies will fail
- Feature toggles are your friends
- Abuser stories
3. Agenda
• What to expect?
• What brought me here?
• What is a fire?
• Fire-fighting vs. the team
• Mapping out dependencies
• Feature toggles
• Abuser Stories
• The unknown unknowns!
4. What to expect?
Why fires harm more than just your SLIs
A proactive approach to fire-fighting
What could they have done better? (War stories)
5. What brought me here?
• Same process, different problem
• The scale matters
• Less fire-fighting = Higher productivity
7. What is a fire?
Anything that would require an emergency re-allocation of resources. In other words, anything that would require you to drop whatever you have in hand right now and start working on it.
8. Why is it a problem?
Disturbs the planned work
Hero culture
Stressful for fighters
Patches make it chronic
9. How does it happen?
• Cutting corners while solving problems
• Too many problems and not enough time
• Allowing others to control the project agenda
• No priorities, everything is urgent
16. War stories?
• As an abuser, I'd like to live-stream violent content (video) and make it go viral. - Facebook
• As an abuser, I want to post social engineering-based scams and make them go viral. - Twitter
17. "There are known knowns. There are things we know we know. We also know there are known unknowns. That is to say, we know there are some things we do not know. But there are also unknown unknowns, the ones we don't know we don't know." - Donald Rumsfeld
- Same agile process with the same mistakes: skipping retro, too many meetings, missing standup, and even no proper estimation.
- Fires now are caused by live traffic and bad code rollouts, unlike before, when for me a fire was a new requirement from a potential customer.
- The new scale makes a fire actually feel like a fire: it's visible, it's big, and it's its own type of stress.
- I love and hate firefighting; it gives me the rush I need to stay excited, but if it becomes a full-time job it's exhausting, stressful, and leads to a half-baked piece of software with tons of patches.
- It takes time from the sprint and pushes timelines out.
- Watch task traffic: how many tasks are added vs. how many we burn.
- Hero culture messes up the reward/compensation model and leads to stressed employees, no work-life balance, etc.
- Rushed patches lead to other bugs that group with existing bugs, become a fire within months, and then the problem becomes chronic.
The Facebook iOS SDK took down Spotify and TikTok.
Focus on the user activity stream as a dependency. If it goes down, we can work without the tiny green light: we can simply ignore that code path or handle the failure gracefully.
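A minimal sketch of that idea in Python, assuming a hypothetical HTTP client and a `/presence/{user_id}` endpoint (both names are invented for illustration): the presence "green light" is treated as optional, so a dead dependency degrades the page instead of breaking it.

```python
def fetch_presence(user_id, client, timeout=0.2):
    """Return the presence flag, or None if the activity-stream service fails."""
    try:
        resp = client.get(f"/presence/{user_id}", timeout=timeout)
        resp.raise_for_status()
        return resp.json().get("online")
    except Exception:
        # The green dot is cosmetic: swallow the failure and render without it.
        return None


def render_user_badge(user_id, client):
    """Render the user badge with or without the presence dot."""
    online = fetch_presence(user_id, client)
    dot = "" if online is None else (" ●" if online else " ○")
    return f"user:{user_id}{dot}"
```

With the dependency down, `render_user_badge` still returns a usable badge, just without the dot.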
Refer to Martin Fowler's writing on feature toggles and the idea that, for feature releases, we should break the feature down and make the toggle the last resort, in contrast to long-term feature toggles used as a degradation-of-service technique. Even if both use the same framework to flip the switch, we still need to make that logical distinction.
Focus on stored cards as a feature that you want to disable and enable manually, as you might use that switch when you suspect a malicious attack or a technical problem like an outage in your auth service.
Looking at the story from a different perspective really helps with fleshing out the task.
It really helps if you think about it beforehand, rather than while you are already under fire pressure.
It could have helped if they had been ready for that kind of abuser.
- Facebook could have shut down the video much faster.
- Twitter could have used keywords to block a tweet from going viral (which is what they did, but it took them some time).
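The talk only says Twitter eventually did "something like this"; as a hedged sketch, a keyword-based throttle could be as simple as the following (the function name and the example phrases are invented):

```python
# Phrases an operator could add during an incident to stop a scam going viral.
BLOCKED_KEYWORDS = {"crypto giveaway", "send btc"}


def should_throttle(post_text: str) -> bool:
    """Flag a post for reduced distribution if it matches a blocked phrase."""
    text = post_text.lower()
    return any(keyword in text for keyword in BLOCKED_KEYWORDS)
```

The value is less in the matching logic than in having the hook ready before the fire, so the response is editing a keyword list, not shipping emergency code.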
Everything can happen at the same time. Without proper monitoring you are flying blind, with zero visibility into your system. Imagine seeing a drop in sales without knowing why; after a while you realize that customer-support inbound is coming only from Europe and North Africa, you start digging, and you find that a server has run out of disk space because of logs, and the code wasn't written to handle that gracefully.
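A tiny visibility check inspired by that story, assuming a hypothetical `disk_alert` helper and a 10% threshold (both are illustrative choices, not a recommendation): alert before the log partition fills up instead of finding out from a sales graph.

```python
import shutil


def disk_alert(path="/var/log", min_free_ratio=0.10):
    """Return an alert message if free space on `path` drops below the
    threshold, else None. In practice this would feed a monitoring system."""
    usage = shutil.disk_usage(path)
    free_ratio = usage.free / usage.total
    if free_ratio < min_free_ratio:
        return f"LOW DISK on {path}: {free_ratio:.1%} free"
    return None
```

Real deployments would use a proper monitoring stack, but even a cron job running a check like this turns "sales dropped mysteriously" into "disk almost full" hours earlier.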
- Timezones?
- epochs?
- MM-DD or DD-MM?
- Don't forget to hydrate
- It can be stressful and long, so try to stay calm
- "The simplest explanation is most likely the right one." - Occam's razor
- Follow your gut and look for evidence
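The first three items on the checklist above fit in a few lines of Python; the concrete date and timestamp values are arbitrary examples.

```python
from datetime import datetime, timezone

# MM-DD vs DD-MM: the same string parses to two different days.
ambiguous = "03-04-2021"
us = datetime.strptime(ambiguous, "%m-%d-%Y")  # March 4th
eu = datetime.strptime(ambiguous, "%d-%m-%Y")  # April 3rd
assert (us.month, us.day) == (3, 4)
assert (eu.month, eu.day) == (4, 3)

# Epoch seconds vs milliseconds: off by a factor of 1000.
ts = 1_600_000_000  # seconds since 1970-01-01 UTC
dt = datetime.fromtimestamp(ts, tz=timezone.utc)            # 2020-09-13
wrong = datetime.fromtimestamp(ts / 1000, tz=timezone.utc)  # 1970-01-19!

# Timezones: always keep datetimes timezone-aware during an incident.
assert dt.tzinfo is timezone.utc
```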