SlideShare a Scribd company logo
Scaling Twitter To Go After
the Fail Whale
Jonathan Reichhold - Twitter Engineering
Early Twitter....
2010 World Cup Challenge
• Tweet and user requests growing
exponentially (good problem)
Load....
Monolithic Architecture
• Ruby on Rails
• Temporally-sharded MySQL
• Memcached
• ~60 engineers
Stabilize & Understand
• Learn & make improvements
• Don’t just survive
Be Realistic & Ambitious
• Prioritize what can be fixed and timeframes
for doing it
• Sometimes need the duct tape
• Find patterns and improvements for the
long term
A Bad
Approach
• Flip
switches/branches/other
until fixed
http://www.flickr.com/photos/chrism70/1144424032
Science
Step 1:Trustworty Data
• https://blog.twitter.com/2013/observability-at-twitter
Step 2: Set Expectations
• Being on-call is a job and during high stress
will burn folks out
• Maintain calm and order
Post Mortems
• Improvement becomes part of process
• Stress makes system stronger not weaker
Teamwork
• All of this made possible by amazing team
and management
• Culture
Capacity Planning &
Forecast
• Just in time but realistic
• Figure out real buffers
Longer Term Changes
• Architecture changes take time and
changes in organization
Improve Efficiency
• Rails/Ruby -> Scala & JVM
• 200-300 RPS -> 10,000-20,000
• Single process per request -> Finagle
Service Orientation
• Make changes at
interface
boundary, not in
single monolith
• Team interactions
simplified
• Core nouns and
verbs
Move out of public
cloud
• Flexibility and latency demand at some
point
• Hard problem
• Datacenter as failure domain
• Mesos
Dynamic Configuration
• Update routes and compare live vs
dark/new
• Quickly adjust to issues
• Faster and less fragile deploys
Improve storage
• Gizzard for MySQL
• Improve Memcached
• Storage as a service
• Snowflake IDs
Development Speed
• Startups live and die by development speed
• Make easier to ship but contain damage
Conclusion
• Fail whale is now an endangered species
• Went from event driven spikes to pushing
continuous reliability improvements where
events became trivial
Tweet Spikes Today
• New Tweets per second (TPS) record: 143,199 TPS.
Typical day: more than 500 million Tweets sent; average
5,700 TPS. (August 2 at 7:21:50 PDT;August 3 at
11:21:50 JST)
• https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
Final Thoughts
• Marathon not a sprint. Maintain systems
and yourself
• We are hiring to make system even better
Endangered: Fail Whale
Jonathan
Reichhold
@jreichhold
Questions?
• https://blog.twitter.com/2013/new-tweets-per-seco
• https://blog.twitter.com/2013/observability-
at-twitter

More Related Content

What's hot

Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day Toronto
Jon Haddad
 
CQRS
CQRSCQRS
11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers
Ike Ellis
 
MySQL 5.7 - the first few months
MySQL 5.7 - the first few monthsMySQL 5.7 - the first few months
MySQL 5.7 - the first few months
Simon J Mudd
 
MySQL Failover and Orchestrator
MySQL Failover and OrchestratorMySQL Failover and Orchestrator
MySQL Failover and Orchestrator
Simon J Mudd
 
Sustaining Your Career
Sustaining Your CareerSustaining Your Career
Sustaining Your Career
Scott Lowe
 
Experiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for youExperiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for you
Simon J Mudd
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution
SoftServe
 
Kanban @ nine.ch
Kanban @ nine.chKanban @ nine.ch
Kanban @ nine.ch
Matías E. Fernández
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser colleenfry
 
DevOps in the Real World
DevOps in the Real WorldDevOps in the Real World
DevOps in the Real World
Max Yermakhanov
 
Tech writing in a continuous deployment environment
Tech writing in a continuous deployment environmentTech writing in a continuous deployment environment
Tech writing in a continuous deployment environment
Christine Burwinkle
 
SPA vs. MPA
SPA vs. MPASPA vs. MPA
SPA vs. MPA
Mehmet Ali Tastan
 
New Server in an Hour
New Server in an HourNew Server in an Hour
New Server in an Hour
Mike Hillwig
 
In Memory Cahce Structure
In Memory Cahce StructureIn Memory Cahce Structure
In Memory Cahce Structure
Mehmet Ali Tastan
 
MySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the WireMySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the Wire
Simon J Mudd
 
You've Launched! Now What?
You've Launched! Now What?You've Launched! Now What?
You've Launched! Now What?
Amye Scavarda
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
Jon Haddad
 

What's hot (18)

Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day Toronto
 
CQRS
CQRSCQRS
CQRS
 
11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers
 
MySQL 5.7 - the first few months
MySQL 5.7 - the first few monthsMySQL 5.7 - the first few months
MySQL 5.7 - the first few months
 
MySQL Failover and Orchestrator
MySQL Failover and OrchestratorMySQL Failover and Orchestrator
MySQL Failover and Orchestrator
 
Sustaining Your Career
Sustaining Your CareerSustaining Your Career
Sustaining Your Career
 
Experiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for youExperiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for you
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution
 
Kanban @ nine.ch
Kanban @ nine.chKanban @ nine.ch
Kanban @ nine.ch
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser
 
DevOps in the Real World
DevOps in the Real WorldDevOps in the Real World
DevOps in the Real World
 
Tech writing in a continuous deployment environment
Tech writing in a continuous deployment environmentTech writing in a continuous deployment environment
Tech writing in a continuous deployment environment
 
SPA vs. MPA
SPA vs. MPASPA vs. MPA
SPA vs. MPA
 
New Server in an Hour
New Server in an HourNew Server in an Hour
New Server in an Hour
 
In Memory Cahce Structure
In Memory Cahce StructureIn Memory Cahce Structure
In Memory Cahce Structure
 
MySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the WireMySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the Wire
 
You've Launched! Now What?
You've Launched! Now What?You've Launched! Now What?
You've Launched! Now What?
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 

Similar to #Surgeconf Scaling Twitter to go After the Fail Whale

DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
Matt Ray
 
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum PainKafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
confluent
 
Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1
Ortus Solutions, Corp
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
confluent
 
What we learned at pass summit in 2018
What we learned at pass summit in 2018What we learned at pass summit in 2018
What we learned at pass summit in 2018
Red Gate Software
 
Project Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on DockerProject Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on Docker
RightScale
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
John Adams
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
John Adams
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
Amit Kejriwal
 
Dibi Conference 2012
Dibi Conference 2012Dibi Conference 2012
Dibi Conference 2012
Scott Rutherford
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
120bi
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
Achievers Tech
 
E2 evc 3-2-1-rule - mikeresseler
E2 evc   3-2-1-rule - mikeresselerE2 evc   3-2-1-rule - mikeresseler
E2 evc 3-2-1-rule - mikeresseler
Mike Resseler
 
Industrial Data Science
Industrial Data ScienceIndustrial Data Science
Industrial Data Science
Niko Vuokko
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s Guide
VoltDB
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star Schema
DATAVERSITY
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that grow
Gibraltar Software
 
Who wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and ResponsibilitiesWho wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and Responsibilities
Kevin Kline
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like system
Aree Oh
 

Similar to #Surgeconf Scaling Twitter to go After the Fail Whale (20)

DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
 
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum PainKafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
 
Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
 
What we learned at pass summit in 2018
What we learned at pass summit in 2018What we learned at pass summit in 2018
What we learned at pass summit in 2018
 
Project Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on DockerProject Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on Docker
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Gaurav_CV
Gaurav_CVGaurav_CV
Gaurav_CV
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Dibi Conference 2012
Dibi Conference 2012Dibi Conference 2012
Dibi Conference 2012
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
 
E2 evc 3-2-1-rule - mikeresseler
E2 evc   3-2-1-rule - mikeresselerE2 evc   3-2-1-rule - mikeresseler
E2 evc 3-2-1-rule - mikeresseler
 
Industrial Data Science
Industrial Data ScienceIndustrial Data Science
Industrial Data Science
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s Guide
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star Schema
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that grow
 
Who wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and ResponsibilitiesWho wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and Responsibilities
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like system
 

Recently uploaded

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 

#Surgeconf Scaling Twitter to go After the Fail Whale

Editor's Notes

  1. Introduce self.....
  2. Job of startups is to take risk and push quickly but learn from it. Twitter's genesis valued quick design and feature iteration over stability. We were known for the fail whale Events dominating Twitter traffic Limited engineers. Slow to ramp up with traffic (impedence mismatch)
  3. Limited resources (people, machines, and algorithms) Whale was a meme and what we were known for Rapid iteration on features and infrastructure 2010 FIFA World Cup forced an understanding of reliability as a feature on the organization. While we didn't completely break we suffered lots.
  4. Allowed us to iterate fast. Became a bottleneck
  5. Harden the code and algorithms for scale and nature of distributed system Changing the structure and systems to external stresses made the system more resilient to later challenges.  Fail whale extinct
  6. Most people start with pattern they know Nothing is learned in general
  7. Instead propose a hypothesis and test Did the system change as expected? Learn what works (and doesn’t) and use understanding for next iteration
  8. Be rigorous on recording data and validating Lots of time can be wasted on invalid data Signal vs noise False Alerts
  9. Being on-call is a job and during high stress will burn folks out Make sure you have good communications (phone numbers, IRC/Campfire, IM, Chat, etc) that work for group Make sure folks are kept informed regularly and can focus on problems. Both above and below you Maintain calm and order