SlideShare a Scribd company logo
1 of 26
Scaling Twitter To Go After
the Fail Whale
Jonathan Reichhold - Twitter Engineering
Early Twitter....
2010 World Cup Challenge
• Tweet and user requests growing
exponentially (good problem)
Load....
Monolithic Architecture
• Ruby on Rails
• Temporally-sharded MySQL
• Memcached
• ~60 engineers
Stabilize & Understand
• Learn & make improvements
• Don’t just survive
Be Realistic & Ambitious
• Prioritize what can be fixed and timeframes
for doing it
• Sometimes need the duct tape
• Find patterns and improvements for the
long term
A Bad
Approach
• Flip
switches/branches/other
until fixed
http://www.flickr.com/photos/chrism70/1144424032
Science
Step 1:Trustworty Data
• https://blog.twitter.com/2013/observability-at-twitter
Step 2: Set Expectations
• Being on-call is a job and during high stress
will burn folks out
• Maintain calm and order
Post Mortems
• Improvement becomes part of process
• Stress makes system stronger not weaker
Teamwork
• All of this made possible by amazing team
and management
• Culture
Capacity Planning &
Forecast
• Just in time but realistic
• Figure out real buffers
Longer Term Changes
• Architecture changes take time and
changes in organization
Improve Efficiency
• Rails/Ruby -> Scala & JVM
• 200-300 RPS -> 10,000-20,000
• Single process per request -> Finagle
Service Orientation
• Make changes at
interface
boundary, not in
single monolith
• Team interactions
simplified
• Core nouns and
verbs
Move out of public
cloud
• Flexibility and latency demand at some
point
• Hard problem
• Datacenter as failure domain
• Mesos
Dynamic Configuration
• Update routes and compare live vs
dark/new
• Quickly adjust to issues
• Faster and less fragile deploys
Improve storage
• Gizzard for MySQL
• Improve Memcached
• Storage as a service
• Snowflake IDs
Development Speed
• Startups live and die by development speed
• Make easier to ship but contain damage
Conclusion
• Fail whale is now an endangered species
• Went from event driven spikes to pushing
continuous reliability improvements where
events became trivial
Tweet Spikes Today
• New Tweets per second (TPS) record: 143,199 TPS.
Typical day: more than 500 million Tweets sent; average
5,700 TPS. (August 2 at 7:21:50 PDT;August 3 at
11:21:50 JST)
• https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
Final Thoughts
• Marathon not a sprint. Maintain systems
and yourself
• We are hiring to make system even better
Endangered: Fail Whale
Jonathan
Reichhold
@jreichhold
Questions?
• https://blog.twitter.com/2013/new-tweets-per-seco
• https://blog.twitter.com/2013/observability-
at-twitter

More Related Content

What's hot

Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoJon Haddad
 
11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL DevelopersIke Ellis
 
MySQL 5.7 - the first few months
MySQL 5.7 - the first few monthsMySQL 5.7 - the first few months
MySQL 5.7 - the first few monthsSimon J Mudd
 
MySQL Failover and Orchestrator
MySQL Failover and OrchestratorMySQL Failover and Orchestrator
MySQL Failover and OrchestratorSimon J Mudd
 
Sustaining Your Career
Sustaining Your CareerSustaining Your Career
Sustaining Your CareerScott Lowe
 
Experiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for youExperiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for youSimon J Mudd
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution SoftServe
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser colleenfry
 
DevOps in the Real World
DevOps in the Real WorldDevOps in the Real World
DevOps in the Real WorldMax Yermakhanov
 
Tech writing in a continuous deployment environment
Tech writing in a continuous deployment environmentTech writing in a continuous deployment environment
Tech writing in a continuous deployment environmentChristine Burwinkle
 
New Server in an Hour
New Server in an HourNew Server in an Hour
New Server in an HourMike Hillwig
 
MySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the WireMySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the WireSimon J Mudd
 
You've Launched! Now What?
You've Launched! Now What?You've Launched! Now What?
You've Launched! Now What?Amye Scavarda
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraJon Haddad
 

What's hot (18)

Cassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day TorontoCassandra Core Concepts - Cassandra Day Toronto
Cassandra Core Concepts - Cassandra Day Toronto
 
CQRS
CQRSCQRS
CQRS
 
11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers11 Goals of High Functioning SQL Developers
11 Goals of High Functioning SQL Developers
 
MySQL 5.7 - the first few months
MySQL 5.7 - the first few monthsMySQL 5.7 - the first few months
MySQL 5.7 - the first few months
 
MySQL Failover and Orchestrator
MySQL Failover and OrchestratorMySQL Failover and Orchestrator
MySQL Failover and Orchestrator
 
Sustaining Your Career
Sustaining Your CareerSustaining Your Career
Sustaining Your Career
 
Experiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for youExperiences testing dev versions of MySQL and why it is good for you
Experiences testing dev versions of MySQL and why it is good for you
 
Containerization: The DevOps Revolution
Containerization: The DevOps Revolution Containerization: The DevOps Revolution
Containerization: The DevOps Revolution
 
Kanban @ nine.ch
Kanban @ nine.chKanban @ nine.ch
Kanban @ nine.ch
 
Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser Show Me the Numbers: Automated Browser
Show Me the Numbers: Automated Browser
 
DevOps in the Real World
DevOps in the Real WorldDevOps in the Real World
DevOps in the Real World
 
Tech writing in a continuous deployment environment
Tech writing in a continuous deployment environmentTech writing in a continuous deployment environment
Tech writing in a continuous deployment environment
 
SPA vs. MPA
SPA vs. MPASPA vs. MPA
SPA vs. MPA
 
New Server in an Hour
New Server in an HourNew Server in an Hour
New Server in an Hour
 
In Memory Cahce Structure
In Memory Cahce StructureIn Memory Cahce Structure
In Memory Cahce Structure
 
MySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the WireMySQL X protocol - Talking to MySQL Directly over the Wire
MySQL X protocol - Talking to MySQL Directly over the Wire
 
You've Launched! Now What?
You've Launched! Now What?You've Launched! Now What?
You've Launched! Now What?
 
Diagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - CassandraDiagnosing Problems in Production - Cassandra
Diagnosing Problems in Production - Cassandra
 

Similar to #Surgeconf Scaling Twitter to go After the Fail Whale

DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...Matt Ray
 
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum PainKafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum Painconfluent
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...confluent
 
What we learned at pass summit in 2018
What we learned at pass summit in 2018What we learned at pass summit in 2018
What we learned at pass summit in 2018Red Gate Software
 
Project Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on DockerProject Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on DockerRightScale
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterJohn Adams
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applicationsAmit Kejriwal
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP120bi
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsAchievers Tech
 
E2 evc 3-2-1-rule - mikeresseler
E2 evc   3-2-1-rule - mikeresselerE2 evc   3-2-1-rule - mikeresseler
E2 evc 3-2-1-rule - mikeresselerMike Resseler
 
Industrial Data Science
Industrial Data ScienceIndustrial Data Science
Industrial Data ScienceNiko Vuokko
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideVoltDB
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star SchemaDATAVERSITY
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that growGibraltar Software
 
Who wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and ResponsibilitiesWho wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and ResponsibilitiesKevin Kline
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like systemAree Oh
 

Similar to #Surgeconf Scaling Twitter to go After the Fail Whale (20)

DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
DevOpsDays Austin: Helping Horses Become Unicorns, Chef's Operations Maturity...
 
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum PainKafka Summit SF 2017 - Running Kafka for Maximum Pain
Kafka Summit SF 2017 - Running Kafka for Maximum Pain
 
Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1Into The Box 2020 Keynote Day 1
Into The Box 2020 Keynote Day 1
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
 
What we learned at pass summit in 2018
What we learned at pass summit in 2018What we learned at pass summit in 2018
What we learned at pass summit in 2018
 
Project Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on DockerProject Sherpa: How RightScale Went All in on Docker
Project Sherpa: How RightScale Went All in on Docker
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Gaurav_CV
Gaurav_CVGaurav_CV
Gaurav_CV
 
Building data intensive applications
Building data intensive applicationsBuilding data intensive applications
Building data intensive applications
 
Dibi Conference 2012
Dibi Conference 2012Dibi Conference 2012
Dibi Conference 2012
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
 
E2 evc 3-2-1-rule - mikeresseler
E2 evc   3-2-1-rule - mikeresselerE2 evc   3-2-1-rule - mikeresseler
E2 evc 3-2-1-rule - mikeresseler
 
Industrial Data Science
Industrial Data ScienceIndustrial Data Science
Industrial Data Science
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s Guide
 
The Death of the Star Schema
The Death of the Star SchemaThe Death of the Star Schema
The Death of the Star Schema
 
Scaling Systems: Architectures that grow
Scaling Systems: Architectures that growScaling Systems: Architectures that grow
Scaling Systems: Architectures that grow
 
Who wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and ResponsibilitiesWho wants to be a DBA? Roles and Responsibilities
Who wants to be a DBA? Roles and Responsibilities
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like system
 

Recently uploaded

(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 

Recently uploaded (20)

(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 

#Surgeconf Scaling Twitter to go After the Fail Whale

Editor's Notes

  1. Introduce self.....
  2. Job of startups is to take risk and push quickly but learn from it. Twitter's genesis valued quick design and feature iteration over stability. We were known for the fail whale Events dominating Twitter traffic Limited engineers. Slow to ramp up with traffic (impedence mismatch)
  3. Limited resources (people, machines, and algorithms) Whale was a meme and what we were known for Rapid iteration on features and infrastructure 2010 FIFA World Cup forced an understanding of reliability as a feature on the organization. While we didn't completely break we suffered lots.
  4. Allowed us to iterate fast. Became a bottleneck
  5. Harden the code and algorithms for scale and nature of distributed system Changing the structure and systems to external stresses made the system more resilient to later challenges.  Fail whale extinct
  6. Most people start with pattern they know Nothing is learned in general
  7. Instead propose a hypothesis and test Did the system change as expected? Learn what works (and doesn’t) and use understanding for next iteration
  8. Be rigorous on recording data and validating Lots of time can be wasted on invalid data Signal vs noise False Alerts
  9. Being on-call is a job and during high stress will burn folks out Make sure you have good communications (phone numbers, IRC/Campfire, IM, Chat, etc) that work for group Make sure folks are kept informed regularly and can focus on problems. Both above and below you Maintain calm and order