SlideShare a Scribd company logo
1 of 29
Download to read offline
ABOUT ME
- 14y. in IT
- 13y. in Node.js dev
- RnD Team Lead at WalkMe
- working with highload services
What is
highload?
it is when 2 servers are not enough
Why 2+?
Redundancy
1 service can shut
down or brake
0 downtime updates
You can update 1
service, while 2nd
will handle requests
2 is a minimum number of
servers even for
non-highload projects
When do you need
2+ servers?
- Customers are complaining about
performance
- Your metrics show performance
degradation
- Yes
- Any code optimization has
its limits
- At some point you will
reach your CPU capacity
with more users
Maybe optimize your app?
So adding more servers is the right
approach to handle more request?
Yes, but how many servers?
9 or 10?
Status codes
the more 2xx - the better
the less 5xx - the better
Backend latency
Preferably to respond under 200ms
To satisfy business needs
Be cost effective
The less we spend - the more money
business can get.
How to achieve this?
CPU
~40-60% avg utilization
Memory
<50% max utilization
Traffic pattern
This can affect our auto scaling
parameters
Active handles
Spikes of active handles can block
requests from being processed
Active requests
Spikes of active requests can block
requests from being processed
Event loop lag
can be reason, why we can’t handle
requests in time
Monitoring & auto scaling
$$$$
Case 1: traffic increases and
decreases gradually
$$$$
Case 2: traffic or/and CPU
usage increases and decreases
sporadically
$$
$$
$$
$$
- potential money
saving
Hard to auto scale such systems,
there are some heavy requests.
Possible solution - offload CPU heavy
tasks to offline jobs (workers,
separate deployments)
Node.js metrics: event loop lag
Hundreds of these can cause high event loop lag and
lead to app unresponsiveness.
Mitigation: add setImmediate() to your cycles
event loop lag in sync methods
I hope you are not using sync methods of fs.
Use async variations of methods everywhere.
Do not use it
Use it
How to capture these?
default metrics can be collected in register
of prom-client and later exposed by your
http server, so Prometheus can collect
them and display in Grafana
Exploring event loop lag
Avg event loop lag > 100ms is the case for investigation
Other default metrics, that are collected with
“collectDefaultMetrics”
https://github.com/siimon/prom-client/tree/master/lib/metrics
Debug specific pod and check types of handles
Incoming http requests
from load balancer
Outgoing connections to
3rd parties
'Number of active libuv handles grouped by handle type. Every handle type is C++ class
name.'
But can we optimize
app by itself?
Code improvements: batch writes
Kafka write example.
Batch operations are also supported by Kinesis,
DynamoDb, Aerospike and many more
Batch writes example
Can be applied to any 3rd party, that supports batch writes
Logs, what can go wrong?
100_000 * 3_600 = 0.36B/h
- How much you would pay
to DataDog for this?
- What network load this
will create?
- What CPU load this will
create?
- How would you navigate
through 0.36B of logs per
hour?
In highload this can become
mitigation 1: sample errors
You don’t need all 100_000 errors in your logs
mitigation 2: store statistics of errors
It’s important to know when and how many errors did you
have
Custom metrics with prometheus
Now combine these methods
Error messages should be
persistent
You will know exact
number of events that
happened
You still can find details
about the error, where it
happened
You should tune log rate
to your load. it can be any
number 0.00001%-100%
Conclusion
Horizontal scale is most effective way
to handle more requests
Use as little servers as possible
Use batch operations when possible
log only needed amount of logs
Offload heavy jobs to “offline workers”
Eliminate long blocking operations
Monitor everything
THANK YOU!
Time for questions!
Andrii Shumada
More talks:
https://eagleeye.github.io

More Related Content

Similar to "Surviving highload with Node.js", Andrii Shumada

Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Brian Brazil
 
X-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptX-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptAram Alipoor
 
Serverless meetup Auckland #6
Serverless meetup Auckland #6Serverless meetup Auckland #6
Serverless meetup Auckland #6Myles Henaghan
 
Performance Optimization in Large Systems - Cusec 2019
Performance Optimization in Large Systems - Cusec 2019Performance Optimization in Large Systems - Cusec 2019
Performance Optimization in Large Systems - Cusec 2019Pierre-Luc Maheu
 
High-Speed Reactive Microservices - trials and tribulations
High-Speed Reactive Microservices - trials and tribulationsHigh-Speed Reactive Microservices - trials and tribulations
High-Speed Reactive Microservices - trials and tribulationsRick Hightower
 
Building and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemBuilding and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemKapil Reddy
 
Deep Dive: AWS X-Ray London Summit 2017
Deep Dive: AWS X-Ray London Summit 2017Deep Dive: AWS X-Ray London Summit 2017
Deep Dive: AWS X-Ray London Summit 2017Randall Hunt
 
Scalability using Node.js
Scalability using Node.jsScalability using Node.js
Scalability using Node.jsratankadam
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Brian Brazil
 
Scalable Apache for Beginners
Scalable Apache for BeginnersScalable Apache for Beginners
Scalable Apache for Beginnerswebhostingguy
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Brian Brazil
 
Enterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsEnterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsDhaval Shah
 
Server Monitoring (Scaling while bootstrapped)
Server Monitoring  (Scaling while bootstrapped)Server Monitoring  (Scaling while bootstrapped)
Server Monitoring (Scaling while bootstrapped)Ajibola Aiyedogbon
 
Serverless Computing
Serverless ComputingServerless Computing
Serverless ComputingAnand Gupta
 
Introduce AWS Lambda for newbie and Non-IT
Introduce AWS Lambda for newbie and Non-ITIntroduce AWS Lambda for newbie and Non-IT
Introduce AWS Lambda for newbie and Non-ITChitpong Wuttanan
 
Introduction to requirement of microservices
Introduction to requirement of microservicesIntroduction to requirement of microservices
Introduction to requirement of microservicesAvik Das
 
Operations: Production Readiness
Operations: Production ReadinessOperations: Production Readiness
Operations: Production ReadinessAmazon Web Services
 
Cloud Native & Service Mesh
Cloud Native & Service MeshCloud Native & Service Mesh
Cloud Native & Service MeshRoi Ezra
 
Richardrodger nodeday-2014-final
Richardrodger nodeday-2014-finalRichardrodger nodeday-2014-final
Richardrodger nodeday-2014-finalRichard Rodger
 

Similar to "Surviving highload with Node.js", Andrii Shumada (20)

Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
X-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-conceptX-Ray distributed tracing proof-of-concept
X-Ray distributed tracing proof-of-concept
 
Serverless meetup Auckland #6
Serverless meetup Auckland #6Serverless meetup Auckland #6
Serverless meetup Auckland #6
 
Performance Optimization in Large Systems - Cusec 2019
Performance Optimization in Large Systems - Cusec 2019Performance Optimization in Large Systems - Cusec 2019
Performance Optimization in Large Systems - Cusec 2019
 
High-Speed Reactive Microservices - trials and tribulations
High-Speed Reactive Microservices - trials and tribulationsHigh-Speed Reactive Microservices - trials and tribulations
High-Speed Reactive Microservices - trials and tribulations
 
Going Serverless on AWS
Going Serverless on AWSGoing Serverless on AWS
Going Serverless on AWS
 
Building and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub SystemBuilding and Scaling a WebSockets Pubsub System
Building and Scaling a WebSockets Pubsub System
 
Deep Dive: AWS X-Ray London Summit 2017
Deep Dive: AWS X-Ray London Summit 2017Deep Dive: AWS X-Ray London Summit 2017
Deep Dive: AWS X-Ray London Summit 2017
 
Scalability using Node.js
Scalability using Node.jsScalability using Node.js
Scalability using Node.js
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
 
Scalable Apache for Beginners
Scalable Apache for BeginnersScalable Apache for Beginners
Scalable Apache for Beginners
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)
 
Enterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & LearningsEnterprise application performance - Understanding & Learnings
Enterprise application performance - Understanding & Learnings
 
Server Monitoring (Scaling while bootstrapped)
Server Monitoring  (Scaling while bootstrapped)Server Monitoring  (Scaling while bootstrapped)
Server Monitoring (Scaling while bootstrapped)
 
Serverless Computing
Serverless ComputingServerless Computing
Serverless Computing
 
Introduce AWS Lambda for newbie and Non-IT
Introduce AWS Lambda for newbie and Non-ITIntroduce AWS Lambda for newbie and Non-IT
Introduce AWS Lambda for newbie and Non-IT
 
Introduction to requirement of microservices
Introduction to requirement of microservicesIntroduction to requirement of microservices
Introduction to requirement of microservices
 
Operations: Production Readiness
Operations: Production ReadinessOperations: Production Readiness
Operations: Production Readiness
 
Cloud Native & Service Mesh
Cloud Native & Service MeshCloud Native & Service Mesh
Cloud Native & Service Mesh
 
Richardrodger nodeday-2014-final
Richardrodger nodeday-2014-finalRichardrodger nodeday-2014-final
Richardrodger nodeday-2014-final
 

More from Fwdays

"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...
"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y..."How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...
"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...Fwdays
 
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil TopchiiFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
"What is a RAG system and how to build it",Dmytro Spodarets
"What is a RAG system and how to build it",Dmytro Spodarets"What is a RAG system and how to build it",Dmytro Spodarets
"What is a RAG system and how to build it",Dmytro SpodaretsFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"Distributed graphs and microservices in Prom.ua", Maksym Kindritskyi
"Distributed graphs and microservices in Prom.ua",  Maksym Kindritskyi"Distributed graphs and microservices in Prom.ua",  Maksym Kindritskyi
"Distributed graphs and microservices in Prom.ua", Maksym KindritskyiFwdays
 
"Rethinking the existing data loading and processing process as an ETL exampl...
"Rethinking the existing data loading and processing process as an ETL exampl..."Rethinking the existing data loading and processing process as an ETL exampl...
"Rethinking the existing data loading and processing process as an ETL exampl...Fwdays
 
"How Ukrainian IT specialist can go on vacation abroad without crossing the T...
"How Ukrainian IT specialist can go on vacation abroad without crossing the T..."How Ukrainian IT specialist can go on vacation abroad without crossing the T...
"How Ukrainian IT specialist can go on vacation abroad without crossing the T...Fwdays
 
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ..."The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...Fwdays
 
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu..."[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...Fwdays
 
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care..."[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...Fwdays
 
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"..."4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...Fwdays
 
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout", Anast...
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout",  Anast..."Reconnecting with Purpose: Rediscovering Job Interest after Burnout",  Anast...
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout", Anast...Fwdays
 
"Mentoring 101: How to effectively invest experience in the success of others...
"Mentoring 101: How to effectively invest experience in the success of others..."Mentoring 101: How to effectively invest experience in the success of others...
"Mentoring 101: How to effectively invest experience in the success of others...Fwdays
 
"Mission (im) possible: How to get an offer in 2024?", Oleksandra Myronova
"Mission (im) possible: How to get an offer in 2024?",  Oleksandra Myronova"Mission (im) possible: How to get an offer in 2024?",  Oleksandra Myronova
"Mission (im) possible: How to get an offer in 2024?", Oleksandra MyronovaFwdays
 
"Why have we learned how to package products, but not how to 'package ourselv...
"Why have we learned how to package products, but not how to 'package ourselv..."Why have we learned how to package products, but not how to 'package ourselv...
"Why have we learned how to package products, but not how to 'package ourselv...Fwdays
 
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin..."How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...Fwdays
 

More from Fwdays (20)

"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...
"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y..."How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...
"How Preply reduced ML model development time from 1 month to 1 day",Yevhen Y...
 
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii
"GenAI Apps: Our Journey from Ideas to Production Excellence",Danil Topchii
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
"What is a RAG system and how to build it",Dmytro Spodarets
"What is a RAG system and how to build it",Dmytro Spodarets"What is a RAG system and how to build it",Dmytro Spodarets
"What is a RAG system and how to build it",Dmytro Spodarets
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"Distributed graphs and microservices in Prom.ua", Maksym Kindritskyi
"Distributed graphs and microservices in Prom.ua",  Maksym Kindritskyi"Distributed graphs and microservices in Prom.ua",  Maksym Kindritskyi
"Distributed graphs and microservices in Prom.ua", Maksym Kindritskyi
 
"Rethinking the existing data loading and processing process as an ETL exampl...
"Rethinking the existing data loading and processing process as an ETL exampl..."Rethinking the existing data loading and processing process as an ETL exampl...
"Rethinking the existing data loading and processing process as an ETL exampl...
 
"How Ukrainian IT specialist can go on vacation abroad without crossing the T...
"How Ukrainian IT specialist can go on vacation abroad without crossing the T..."How Ukrainian IT specialist can go on vacation abroad without crossing the T...
"How Ukrainian IT specialist can go on vacation abroad without crossing the T...
 
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ..."The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...
"The Strength of Being Vulnerable: the experience from CIA, Tesla and Uber", ...
 
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu..."[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...
"[QUICK TALK] Radical candor: how to achieve results faster thanks to a cultu...
 
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care..."[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...
"[QUICK TALK] PDP Plan, the only one door to raise your salary and boost care...
 
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"..."4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...
"4 horsemen of the apocalypse of working relationships (+ antidotes to them)"...
 
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout", Anast...
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout",  Anast..."Reconnecting with Purpose: Rediscovering Job Interest after Burnout",  Anast...
"Reconnecting with Purpose: Rediscovering Job Interest after Burnout", Anast...
 
"Mentoring 101: How to effectively invest experience in the success of others...
"Mentoring 101: How to effectively invest experience in the success of others..."Mentoring 101: How to effectively invest experience in the success of others...
"Mentoring 101: How to effectively invest experience in the success of others...
 
"Mission (im) possible: How to get an offer in 2024?", Oleksandra Myronova
"Mission (im) possible: How to get an offer in 2024?",  Oleksandra Myronova"Mission (im) possible: How to get an offer in 2024?",  Oleksandra Myronova
"Mission (im) possible: How to get an offer in 2024?", Oleksandra Myronova
 
"Why have we learned how to package products, but not how to 'package ourselv...
"Why have we learned how to package products, but not how to 'package ourselv..."Why have we learned how to package products, but not how to 'package ourselv...
"Why have we learned how to package products, but not how to 'package ourselv...
 
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin..."How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...
"How to tame the dragon, or leadership with imposter syndrome", Oleksandr Zin...
 

Recently uploaded

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

"Surviving highload with Node.js", Andrii Shumada

  • 1.
  • 2. ABOUT ME - 14y. in IT - 13y. in Node.js dev - RnD Team Lead at WalkMe - working with highload services
  • 4. it is when 2 servers are not enough
  • 5. Why 2+? Redundancy 1 service can shut down or brake 0 downtime updates You can update 1 service, while 2nd will handle requests 2 is a minimum number of servers even for non-highload projects
  • 6. When do you need 2+ servers? - Customers are complaining about performance - Your metrics show performance degradation
  • 7. - Yes - Any code optimization has its limits - At some point you will reach your CPU capacity with more users Maybe optimize your app?
  • 8. So adding more servers is the right approach to handle more request?
  • 9. Yes, but how many servers? 9 or 10?
  • 10. Status codes the more 2xx - the better the less 5xx - the better Backend latency Preferably to respond under 200ms To satisfy business needs Be cost effective The less we spend - the more money business can get. How to achieve this?
  • 11. CPU ~40-60% avg utilization Memory <50% max utilization Traffic pattern This can affect our auto scaling parameters Active handles Spikes of active handles can block requests from being processed Active requests Spikes of active requests can block requests from being processed Event loop lag can be reason, why we can’t handle requests in time Monitoring & auto scaling
  • 12. $$$$ Case 1: traffic increases and decreases gradually $$$$
  • 13. Case 2: traffic or/and CPU usage increases and decreases sporadically $$ $$ $$ $$ - potential money saving Hard to auto scale such systems, there are some heavy requests. Possible solution - offload CPU heavy tasks to offline jobs (workers, separate deployments)
  • 14. Node.js metrics: event loop lag Hundreds of these can cause high event loop lag and lead to app unresponsiveness. Mitigation: add setImmediate() to your cycles
  • 15. event loop lag in sync methods I hope you are not using sync methods of fs. Use async variations of methods everywhere. Do not use it Use it
  • 16. How to capture these? default metrics can be collected in register of prom-client and later exposed by your http server, so Prometheus can collect them and display in Grafana
  • 17. Exploring event loop lag Avg event loop lag > 100ms is the case for investigation
  • 18. Other default metrics, that are collected with “collectDefaultMetrics” https://github.com/siimon/prom-client/tree/master/lib/metrics
  • 19. Debug specific pod and check types of handles Incoming http requests from load balancer Outgoing connections to 3rd parties 'Number of active libuv handles grouped by handle type. Every handle type is C++ class name.'
  • 20. But can we optimize app by itself?
  • 21. Code improvements: batch writes Kafka write example. Batch operations are also supported by Kinesis, DynamoDb, Aerospike and many more
  • 22. Batch writes example Can be applied to any 3rd party, that supports batch writes
  • 23. Logs, what can go wrong? 100_000 * 3_600 = 0.36B/h - How much you would pay to DataDog for this? - What network load this will create? - What CPU load this will create? - How would you navigate through 0.36B of logs per hour? In highload this can become
  • 24. mitigation 1: sample errors You don’t need all 100_000 errors in your logs
  • 25. mitigation 2: store statistics of errors It’s important to know when and how many errors did you have
  • 26. Custom metrics with prometheus
  • 27. Now combine these methods Error messages should be persistent You will know exact number of events that happened You still can find details about the error, where it happened You should tune log rate to your load. it can be any number 0.00001%-100%
  • 28. Conclusion Horizontal scale is most effective way to handle more requests Use as little servers as possible Use batch operations when possible log only needed amount of logs Offload heavy jobs to “offline workers” Eliminate long blocking operations Monitor everything
  • 29. THANK YOU! Time for questions! Andrii Shumada More talks: https://eagleeye.github.io