PHP at 5000 Requests / Sec 
Hootsuite’s Scaling Story 
Bill Monkman 
Lead Technical Engineer - Platform 
@bmonkman
Overview - Selected Current Architecture 
Users 
Nginx load balancers: lb1, lb2, lb3, ... 
Nginx web servers: web1, web2, web3, ... 
PHP-FPM on each web server 
Memcached cluster: mem1, ... 
MySQL cluster: master, slave 
MongoDB cluster: shard1 (master, slave), shard2 (master, slave) 
Gearman cluster: geard1, geard2; worker1, ... 
Services
Technologies - at first 
• Apache 
• PHP 
• MySQL
Then...
Problem 
It’s hard to scale MySQL horizontally
Solution - Caching 
Memcached. 
● Distributed cache, cluster of boxes with lots of RAM, trivial to scale 
● Cache as much as possible, invalidate only when necessary 
● Use cache instead of DB 
● No joins - decouple entities (collection caching) 
● Twemproxy!
“There are only two hard things in 
Computer Science: cache invalidation and 
naming things.” 
• Phil Karlton
Solution - Caching 
MvcModelBaseCaching 
MvcModelBase 
MvcModelMysql 
SocialNetwork
Solution - Caching 
SELECT * FROM member WHERE org_id=888 
set individual cache records 
member_1 {data} 
member_5 {data} 
member_9 {data} 
set collection cache 
member_org_888 [1,5,9] 
Automatic invalidation of collection cache
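The collection-caching idea on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Hootsuite's implementation: `ArrayCache` is an in-memory stand-in for a Memcached client, and `fetchMembersByOrg`, `invalidateOrgMembers`, and `$loadFromDb` are hypothetical names. Only the key shapes (`member_1`, `member_org_888`) come from the slide.

```php
<?php
// Hypothetical in-memory stand-in for a Memcached client (assumption, not from the talk).
final class ArrayCache
{
    private array $store = [];

    public function get(string $key)
    {
        return $this->store[$key] ?? null;
    }

    public function set(string $key, $value): void
    {
        $this->store[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->store[$key]);
    }
}

// Returns the members of an organization, keyed by member id.
// $loadFromDb is a hypothetical callable standing in for
// "SELECT * FROM member WHERE org_id = ?"; it returns rows keyed by member id.
function fetchMembersByOrg(ArrayCache $cache, int $orgId, callable $loadFromDb): array
{
    $collectionKey = "member_org_{$orgId}";
    $ids = $cache->get($collectionKey);

    if ($ids !== null) {
        // Collection hit: assemble the result from the individual records.
        $members = [];
        foreach ($ids as $id) {
            $record = $cache->get("member_{$id}");
            if ($record === null) {
                $members = null; // a record is missing; fall back to the database
                break;
            }
            $members[$id] = $record;
        }
        if ($members !== null) {
            return $members;
        }
    }

    // Miss: query once, then cache each record and the list of ids.
    $rows = $loadFromDb($orgId);
    foreach ($rows as $id => $row) {
        $cache->set("member_{$id}", $row);
    }
    $cache->set($collectionKey, array_keys($rows));
    return $rows;
}

// Drop the collection key when the organization's membership changes.
function invalidateOrgMembers(ArrayCache $cache, int $orgId): void
{
    $cache->delete("member_org_{$orgId}");
}
```

Because the collection key holds only ids, changing one member's data means rewriting a single `member_N` record; the collection itself needs invalidating only when membership changes.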
Solution - Caching 
It’s hard to scale MySQL horizontally 
Now: 
● No need to scale MySQL 
● Able to serve the whole site on 1 MySQL server 
● 500 MySQL SELECTs per second vs. 50,000 Memcached GETs per second 
● 99+% hit rate
Then...
Problem 
Need a way to perform asynchronous, distributed tasks using a 
single-threaded language.
Solution - Gearman 
Gearman. 
● Distribute work to other servers to handle (workers also using 
PHP, same codebase) 
● Precursor to SOA where everything is truly distributed 
● Many alternatives exist, including other queueing systems.
Solution - Gearman 
geard1 geard2 
gearworker1 gearworker2 gearworker6
Solution - Gearman 
Need a way to perform asynchronous, distributed tasks using a 
single-threaded language. 
Now: 
● Moved key tasks to Gearman 
● Another cluster, scalable separately from web 
● Discrete tasks, callable sync or async
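To make "discrete tasks, callable sync or async" concrete, here is a small simulation. It does not talk to Gearman: `TaskQueue` and its methods are hypothetical, in-process stand-ins for a job server and a worker loop. With the PHP Gearman extension, the corresponding client calls would be `GearmanClient::doNormal()` for a synchronous job and `GearmanClient::doBackground()` for an asynchronous one.

```php
<?php
// Hypothetical in-process stand-in for a job server (assumption, not Gearman itself).
final class TaskQueue
{
    /** @var array<string, callable> registered task handlers by name */
    private array $handlers = [];
    /** @var array<int, array{0: string, 1: string}> pending background jobs */
    private array $pending = [];

    public function register(string $task, callable $handler): void
    {
        $this->handlers[$task] = $handler;
    }

    // Synchronous call: run the task now and return its result to the caller.
    public function runSync(string $task, string $payload): string
    {
        if (!isset($this->handlers[$task])) {
            throw new InvalidArgumentException("Unknown task: {$task}");
        }
        return ($this->handlers[$task])($payload);
    }

    // Asynchronous call: record the job and return immediately.
    public function enqueue(string $task, string $payload): void
    {
        $this->pending[] = [$task, $payload];
    }

    // What a worker loop would do: take pending jobs in order and run them.
    public function drain(): int
    {
        $done = 0;
        while ($job = array_shift($this->pending)) {
            [$task, $payload] = $job;
            $this->runSync($task, $payload);
            $done++;
        }
        return $done;
    }
}
```

The same handler serves both call styles; only the caller decides whether it waits for the result.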
Then...
Problem 
Need to store data with the potential to grow too big to handle 
effectively with MySQL.
Solution - MongoDB 
MongoDB. 
● Certain data did not need to be highly relational 
● NoSQL DB, many other solutions these days 
● Mongo can be a pain, lots of moving parts 
● Had to make our own sequencer where auto-incremented ids were 
necessary
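The talk does not show how the sequencer was built. One common approach is a counter per sequence name that is incremented atomically. The sketch below shows only the interface and uses an in-memory array; the `Sequencer` class is hypothetical, and in MongoDB the increment would have to be a single atomic update (for example a find-and-modify with `$inc`) rather than a separate read and write.

```php
<?php
// Hypothetical sequencer; the array stands in for a "counters" collection.
final class Sequencer
{
    /** @var array<string, int> last id issued per sequence name */
    private array $counters = [];

    // Returns the next id for the named sequence, starting at 1.
    // Against a real database this read-increment-write would need to be
    // one atomic operation so that concurrent callers never share an id.
    public function next(string $name): int
    {
        $this->counters[$name] = ($this->counters[$name] ?? 0) + 1;
        return $this->counters[$name];
    }
}
```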
Solution - MongoDB 
Need to store data with the potential to grow too big to handle 
effectively with MySQL. 
Now: 
● Multiple clusters containing amounts of data that likely would 
have crushed MySQL 
● Billions of rows per collection, many TB of data on disk
Technologies 
• Apache 
• PHP 
• MySQL 
• Memcached 
• Gearman 
• MongoDB
Then...
Problem 
With a codebase and an engineering team increasing in size, how do 
we keep up the pace of development and maintain control of the 
system? 
(SVN, big branches, merge hell)
Solution - Dark Launching 
Dark Launching. 
● Wrap code in block with a specific name 
● That name will appear in a management page 
● Can control whether or not that block is executed by modifying its value 
● Boolean, random percentage, session-based, member list, organization 
list, etc.
Solution - Dark Launching 
if (In_Feature::isEnabled('TWITTER_ADS')) { 
// execute new code 
} else { 
// execute old code 
}
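A flag check like the one above might be backed by something like the following. This is a sketch, not Hootsuite's `In_Feature` class: `FeatureFlags` and its methods are hypothetical, and only three of the rule types from the previous slide are modelled. Note one deliberate substitution: the slide mentions a random percentage, whereas this sketch hashes the feature name and member id so that each member gets a stable answer between requests.

```php
<?php
// Hypothetical feature-flag registry; not Hootsuite's In_Feature class.
final class FeatureFlags
{
    /** @var array<string, array> rule per feature name */
    private array $rules = [];

    public function setBoolean(string $name, bool $on): void
    {
        $this->rules[$name] = ['type' => 'boolean', 'on' => $on];
    }

    public function setPercentage(string $name, int $percent): void
    {
        $this->rules[$name] = ['type' => 'percentage', 'percent' => $percent];
    }

    public function setMemberList(string $name, array $memberIds): void
    {
        $this->rules[$name] = ['type' => 'members', 'ids' => $memberIds];
    }

    // Unknown features are treated as disabled, so new code stays dark by default.
    public function isEnabled(string $name, int $memberId): bool
    {
        $rule = $this->rules[$name] ?? null;
        if ($rule === null) {
            return false;
        }
        switch ($rule['type']) {
            case 'boolean':
                return $rule['on'];
            case 'percentage':
                // Hash feature and member together into a bucket from 0 to 99.
                $bucket = hexdec(substr(md5($name . ':' . $memberId), 0, 6)) % 100;
                return $bucket < $rule['percent'];
            case 'members':
                return in_array($memberId, $rule['ids'], true);
            default:
                return false;
        }
    }
}
```

Changing a rule at runtime, for example from a member list to 10 percent and then to a plain boolean, widens the rollout without a new deploy.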
Dark Launching - Reasons 
• Control your code 
• Limit risk -> raise confidence -> speed up pace of releases 
• “Branching in Production” 
• Learning happens in Production
Solution - Dark Launching 
With a codebase and an engineering team increasing in size, how do 
we keep up the pace of development and maintain control of the 
system? 
Now: 
● Work fast with more confidence 
● Huge amount of control over production systems 
● Typically 10+ code releases to production per day 
● Push-based distribution with Consul
Then...
Problem 
With a rapidly growing codebase and a rising number of users and 
volume of traffic, how do we keep visibility into the performance of the code?
Solution - Monitoring 
Statsd / Graphite. 
Logstash / Elasticsearch / Kibana. 
Sensu 
● Statsd for metrics 
● Logstash for log events 
● Sensu for monitoring / alerting
Solution - Monitoring 
Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
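For readers unfamiliar with StatsD, a timing call like this typically ends up as one short text line sent over UDP. The sketch below shows that wire format; `statsdTimingLine` and `statsdSend` are hypothetical helpers, not the `Statsd` class used on the slide, and the host is an assumption. It assumes the elapsed time is measured in seconds with `microtime(true)` and converted to milliseconds, the unit StatsD timers use.

```php
<?php
// Builds a StatsD timer line such as "apiCall.facebookGraph:320|ms".
// Hypothetical helper; the Statsd class on the slide is Hootsuite's own.
function statsdTimingLine(string $metric, float $seconds): string
{
    $milliseconds = (int) round($seconds * 1000);
    return "{$metric}:{$milliseconds}|ms";
}

// Sends one metric line over UDP without waiting for a reply.
// The host is an assumption; 8125 is the port StatsD listens on by default.
function statsdSend(string $line, string $host = '127.0.0.1', int $port = 8125): void
{
    $socket = @fsockopen("udp://{$host}", $port, $errno, $errstr, 0.1);
    if ($socket === false) {
        return; // metrics are best-effort; never fail the request over them
    }
    fwrite($socket, $line);
    fclose($socket);
}
```

Because UDP is fire-and-forget, instrumenting a code path this way adds very little latency to the request being measured.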
Solution - Monitoring 
Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);
Solution - Monitoring 
• Visibility into the performance and behaviour of your application 
• Iterate upon your code, measure results 
• Pairs well with dark launching 
• Also systems like New Relic
Solution - Monitoring 
With a rapidly growing codebase and a rising number of users and 
volume of traffic, how do we keep visibility into the performance of the code? 
Now: 
● Able to watch performance / behaviour in real time. 
● Able to view important events both in aggregate and at a very 
granular level 
● Able to control the system and watch the effect of changes
Optimizations
Optimizations 
• Things expand beyond their initial scope 
• Case in point: Translations
Optimizations - Push work to users 
• Within reason, push work up to users 
• Make your users into a distributed processing grid 
• e.g. Stream rendering
Optimizations - Performance / Risks 
• Performance is more important than clean code or business requirements 
(in the instances where they may be mutually exclusive) 
• Fine line between future proofing and premature optimization 
• Don’t add burdensome processes, but make it easy for your team 
to do things the right way 
• Know your weak spots, protect against abuse
Technologies 
Linux 
Nginx 
Elasticsearch 
Varnish 
PHP-FPM 
MySQL 
Jenkins 
Scala 
MongoDB 
Consul 
Gearman 
Redis 
Akka 
Python 
Memcached 
HAProxy 
jQuery 
ZeroMQ 
Backbone 
RabbitMQ 
EC2 
Zend 
Docker 
Cloudfront CDN 
Logstash 
Zookeeper 
Kibana 
Statsd/Graphite 
Packer 
Vagrant 
Nagios 
VirtualBox 
Spark/Shark 
Sensu 
Symfony 
Riak 
Composer 
Websockets 
Comet 
Hadoop 
Ansible 
Git 
Webpack 
Redshift
Problem 
With a huge and growing monolithic codebase and over 80 
engineers, how to keep scaling in a manageable way?
Solution - SOA 
SOA. 
● Split up the system into independent services which communicate only via APIs 
● Teams can work on their own services with encapsulated business logic and have their own 
deployment schedules. 
● We chose to use Scala/Akka for services, communicating via ZeroMQ 
● SOA transition made easier by the “no joins” philosophy 
● Tons of work
Solution - SOA 
SOM. 
● “Service Oriented Monolith” 
● When splitting up a monolithic codebase, dependencies are what kill you 
● Fulfill dependencies by writing interim services using existing PHP code 
● Maintain the contract and future Scala services will be drop-in 
replacements
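The "maintain the contract" point can be illustrated with a PHP interface. Everything named here is hypothetical (`MemberService`, `InterimMemberService`, `RemoteMemberService`, `memberName`, and the `member.get` operation); the sketch only shows how an interim service built on existing PHP code and a later remote service can sit behind the same contract. The transport is a plain callable; ZeroMQ, which the talk names, is not modelled.

```php
<?php
// Hypothetical service contract; callers depend only on this interface.
interface MemberService
{
    /** Returns the member record as an array, or null if not found. */
    public function getMember(int $memberId): ?array;
}

// Interim service: fulfils the contract by delegating to existing PHP code.
// $legacyLookup stands in for whatever the monolith already does.
final class InterimMemberService implements MemberService
{
    /** @var callable */
    private $legacyLookup;

    public function __construct(callable $legacyLookup)
    {
        $this->legacyLookup = $legacyLookup;
    }

    public function getMember(int $memberId): ?array
    {
        return ($this->legacyLookup)($memberId);
    }
}

// Later replacement: same contract, but the work happens in a remote service.
// $transport is a hypothetical callable that sends a request and returns the
// decoded response.
final class RemoteMemberService implements MemberService
{
    /** @var callable */
    private $transport;

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    public function getMember(int $memberId): ?array
    {
        return ($this->transport)('member.get', ['id' => $memberId]);
    }
}

// Calling code is written once against the interface.
function memberName(MemberService $service, int $memberId): string
{
    $member = $service->getMember($memberId);
    return $member['name'] ?? 'unknown';
}
```

Swapping `InterimMemberService` for `RemoteMemberService` changes only the object that is constructed; `memberName` and every other caller stay as they are.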
Solution - SOA 
With a huge and growing monolithic codebase and over 130 
engineers, how to keep scaling in a manageable way? 
Today: 
● Transitioning to Scala SOA 
● PHP will still be used as the Façade, a thin layer built on top of 
the business logic of the services it interacts with.
Conclusion
Thank You! 
Bill Monkman 
@bmonkman 
More Info: 
code.hootsuite.com

PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story

  • 1. PHP at 5000 Requests / Sec Hootsuite’s Scaling Story Bill Monkman Lead Technical Engineer - Platform @bmonkman
  • 2.
  • 3. Overview - Selected Current Architecture Users lb1 lb2 lb3 ... Nginx Load balancers web1 web2 web3 ... Nginx web servers PHP-FPM PHP-FPM PHP-FPM PHP-FPM Memcached cluster mem1 ... Mysql cluster master slave MongoDB cluster master slave master slave shard1 shard2 Gearman cluster geard1 geard2 worker1 ... ... ... Services
  • 4. Technologies - at first • Apache • PHP • MySQL
  • 6. Problem It’s hard to scale MySQL horizontally
  • 7. Solution - Caching Memcached. ● Distributed cache, cluster of boxes with lots of RAM, trivial to scale ● Cache as much as possible, invalidate only when necessary ● Use cache instead of DB ● No joins - decouple entities (collection caching) ● Twemproxy!
  • 8. “There are only two hard things in Computer Science: cache invalidation and naming things.” • Phil Karlton
  • 9. Solution - Caching MvcModelBaseCaching MvcModelBase MvcModelMysql SocialNetwork
  • 10. Solution - Caching SELECT * FROM member WHERE org_id=888 set individual cache records member_1 {data} member_5 {data} member_9 {data} set collection cache member_org_888 [1,5,9] Automatic invalidation of collection cache
  • 11. Solution - Caching It’s hard to scale MySQL horizontally Now: ● No need to scale MySQL ● Able to serve the whole site on 1 MySQL server ● 500 MySQL SELECTs per second. 50,000 Memcached GETs. ● 99+% hit rate
  • 13. Problem Need a way to perform asynchronous, distributed tasks using a single-threaded language.
  • 14. Solution - Gearman Gearman. ● Distribute work to other servers to handle (workers also using PHP, same codebase) ● Precursor to SOA where everything is truly distributed ● Many other solutions, queueing systems.
  • 15. Solution - Gearman geard1 geard2 gearworker1 gearworker2 gearworker6
  • 16. Solution - Gearman Need a way to perform asynchronous, distributed tasks using a single-threaded language. Now: ● Moved key tasks to Gearman ● Another cluster, scalable separately from web ● Discrete tasks, callable sync or async
  • 18. Problem Need to store data with the potential to grow too big to handle effectively with MySQL.
  • 19. Solution - MongoDB MongoDB. ● Certain data did not need to be highly relational ● NoSQL DB, many other solutions these days ● Mongo can be a pain, lots of moving parts ● Had to make our own sequencer where auto-incremented ids were necessary
  • 20. Solution - MongoDB Need to store data with the potential to grow too big to handle effectively with MySQL. Now: ● Multiple clusters containing amounts of data that likely would have crushed MySQL ● Billions of rows per collection, many TB of data on disk
  • 21. Technologies • Apache • PHP • MySQL • Memcached • Gearman • MongoDB
  • 23. Problem With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? (SVN, big branches, merge hell)
  • 24. Solution - Dark Launching Dark Launching. ● Wrap code in a block with a specific name ● That name will appear in a management page ● Can control whether or not that block is executed by modifying its value ● Boolean, random percentage, session-based, member list, organization list, etc.
  • 25. Solution - Dark Launching if (In_Feature::isEnabled('TWITTER_ADS')) { // execute new code } else { // execute old code }
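One way a flag check like the one on slide 25 might be backed is sketched below. The `FeatureFlags` class, its rule format, and the flag name `NEW_STREAMS` are hypothetical; this is not the `In_Feature` implementation from the talk. The sketch also swaps the slide's "random percentage" for a deterministic hash bucket, so that a given member gets the same answer on every request.

```php
<?php
// Hypothetical flag store; rules would normally come from the management page's storage.
final class FeatureFlags
{
    /** @var array<string, array> flag name => rule */
    private array $rules;

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    public function isEnabled(string $name, ?int $memberId = null): bool
    {
        // Unknown flags are treated as switched off.
        $rule = $this->rules[$name] ?? ['type' => 'boolean', 'value' => false];

        switch ($rule['type']) {
            case 'boolean':
                return (bool) $rule['value'];
            case 'percentage':
                if ($memberId === null) {
                    return false;
                }
                // Map flag + member to a stable bucket in 0..99.
                $bucket = hexdec(substr(md5($name . ':' . $memberId), 0, 6)) % 100;
                return $bucket < $rule['value'];
            case 'members':
                return $memberId !== null && in_array($memberId, $rule['value'], true);
            default:
                return false; // unknown rule types fail closed
        }
    }
}

$flags = new FeatureFlags([
    'TWITTER_ADS' => ['type' => 'boolean', 'value' => true],
    'NEW_STREAMS' => ['type' => 'members', 'value' => [5, 9]],
]);

if ($flags->isEnabled('TWITTER_ADS')) {
    // execute new code
} else {
    // execute old code
}
```

Failing closed on unknown flags and rule types keeps a typo in a flag name from accidentally exposing unfinished code.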
  • 26. Dark Launching - Reasons • Control your code • Limit risk -> raise confidence -> speed up pace of releases • “Branching in Production” • Learning happens in Production
  • 27. Solution - Dark Launching With a codebase and an engineering team increasing in size, how do we keep up the pace of development and maintain control of the system? Now: ● Work fast with more confidence ● Huge amount of control over production systems ● Typically 10+ code releases to production per day ● Push-based distribution with Consul
  • 29. Problem With a rapidly growing codebase and a rising number of users and traffic, how do we keep visibility into the performance of the code?
  • 30. Solution - Monitoring Statsd / Graphite. Logstash / Elasticsearch / Kibana. Sensu ● Statsd for metrics ● Logstash for log events ● Sensu for monitoring / alerting
  • 31. Solution - Monitoring Statsd::timing('apiCall.facebookGraph', microtime(true) - $startTime);
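For context, a timing call like the one on slide 31 typically ends up as a small UDP datagram in the StatsD line format `<bucket>:<value>|ms`. The sketch below is a hypothetical helper, not the `Statsd` class used in the talk; it assumes the wrapper converts the elapsed seconds to milliseconds and that a StatsD daemon listens on `127.0.0.1` at its default port 8125.

```php
<?php
// Hypothetical helper: format an elapsed time (in seconds) as a StatsD timer datagram.
function statsdTimingDatagram(string $bucket, float $seconds): string
{
    // StatsD timers are reported in milliseconds: "<bucket>:<value>|ms".
    return sprintf('%s:%d|ms', $bucket, (int) round($seconds * 1000));
}

// Hypothetical sender; host and port are assumptions for this example.
function sendStatsd(string $datagram, string $host = '127.0.0.1', int $port = 8125): void
{
    // UDP is fire-and-forget, so a missing collector does not slow down the request.
    $socket = @fsockopen("udp://$host", $port, $errno, $errstr, 0.1);
    if ($socket !== false) {
        @fwrite($socket, $datagram);
        fclose($socket);
    }
}

$startTime = microtime(true);
usleep(1000); // stand-in for the API call being measured
sendStatsd(statsdTimingDatagram('apiCall.facebookGraph', microtime(true) - $startTime));
```

Because the metric is sent over UDP and errors are swallowed, instrumenting a hot code path this way adds very little overhead and cannot fail the request.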
  • 32. Solution - Monitoring Logger::event('user liked from in-stream', In_Log::CATEGORY_UX, $logData);
  • 33. Solution - Monitoring • Visibility into the performance and behaviour of your application • Iterate upon your code, measure results • Pairs well with dark launching • Also systems like New Relic
  • 34. Solution - Monitoring With a rapidly growing codebase and a rising number of users and traffic, how do we keep visibility into the performance of the code? Now: ● Able to watch performance / behaviour in real time. ● Able to view important events both in aggregate and at a very granular level ● Able to control the system and watch the effect of changes
  • 36. Optimizations • Things expand beyond their initial scope • Case in point: Translations
  • 39. Optimizations - Push work to users • Within reason, push work up to users • Make your users into a distributed processing grid • e.g. Stream rendering
  • 40. Optimizations - Performance / Risks • Performance is more important than clean code or business requirements (in the instances where they may be mutually exclusive) • Fine line between future proofing and premature optimization • Don’t add burdensome processes, but make it easy for your team to do things the right way • Know your weak spots, protect against abuse
  • 42. Technologies: Linux, Nginx, ElasticSearch, Varnish, PHP-FPM, MySQL, Jenkins, Scala, MongoDB, Consul, Gearman, Redis, Akka, Python, Memcached, HAProxy, jQuery, ZeroMQ, Backbone, RabbitMQ, EC2, Zend, Docker, Cloudfront CDN, Logstash, Zookeeper, Kibana, Statsd/Graphite, Packer, Vagrant, Nagios, VirtualBox, Spark/Shark, Sensu, Symfony, Riak, Composer, Websockets, Comet, Hadoop, Ansible, Git, Webpack, Redshift
  • 43. Problem With a huge and growing monolithic codebase and over 80 engineers, how to keep scaling in a manageable way?
  • 44. Solution - SOA SOA. ● Split up the system into independent services which communicate only via APIs ● Teams can work on their own services with encapsulated business logic and have their own deployment schedules. ● We chose to use Scala/Akka for services, communicating via ZeroMQ ● SOA transition made easier by the “no joins” philosophy ● Tons of work
  • 45. Solution - SOA SOM. ● “Service Oriented Monolith” ● When splitting up a monolithic codebase, dependencies are what kill you ● Fulfill dependencies by writing interim services using existing PHP code ● Maintain the contract and future Scala services will be drop-in replacements
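The "maintain the contract" idea on slide 45 can be illustrated with a small sketch. All names here (`MemberService`, `PhpMemberService`, `RemoteMemberService`, `memberGreeting`) are hypothetical, and the `$transport` callable with a JSON payload merely stands in for whatever messaging layer carries the request; the talk mentions ZeroMQ but does not show a wire format.

```php
<?php
// The contract that callers depend on. All names are hypothetical illustrations.
interface MemberService
{
    public function getMember(int $id): ?array;
}

// Interim service: fulfils the contract with existing PHP code.
// The array stands in for the monolith's model layer.
final class PhpMemberService implements MemberService
{
    private array $members;

    public function __construct(array $members)
    {
        $this->members = $members;
    }

    public function getMember(int $id): ?array
    {
        return $this->members[$id] ?? null;
    }
}

// Later replacement: same contract, but the work happens in a remote service.
final class RemoteMemberService implements MemberService
{
    private $transport; // callable(string $request): string, an assumed messaging layer

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    public function getMember(int $id): ?array
    {
        $reply = ($this->transport)(json_encode(['method' => 'getMember', 'id' => $id]));
        return json_decode($reply, true); // a JSON "null" reply decodes to null
    }
}

// Calling code only knows the interface, so either implementation can be injected.
function memberGreeting(MemberService $service, int $id): string
{
    $member = $service->getMember($id);
    return $member === null ? 'Unknown member' : 'Hello, ' . $member['name'];
}
```

Because `memberGreeting` depends only on the interface, swapping the interim PHP implementation for the remote one requires no change to the caller.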
  • 46. Solution - SOA With a huge and growing monolithic codebase and over 130 engineers, how to keep scaling in a manageable way? Today: ● Transitioning to Scala SOA ● PHP will still be used as the Façade, a thin layer built on top of the business logic of the services it interacts with.
  • 48. Thank You! Bill Monkman @bmonkman More Info: code.hootsuite.com