RATE LIMITING
THROTTLING
WHAT IS RATE LIMITING
 Suppose you are exposing a set of public RESTful APIs and want to limit the number of requests served over a period of time, both to save resources and to protect the service from abuse.
 For example, you might want to allow only 60 calls in a 1-minute window. There are several algorithms for enforcing such a limit; we will discuss each of them in depth.
ALGORITHMS
LEAKY BUCKET
 The leaky bucket is a queue that accepts requests in first-in, first-out (FIFO) order.
 Once the queue is full, the server drops incoming requests until the queue has space again.
HOW IT WORKS
 For example, suppose the queue size is 4 and the server receives requests 1, 2, 3 and 4. All four fit in the queue and are accepted.
 When request 5 then arrives, the server drops it.
 In the image below, the queue size is 2 and 2 requests are accepted; once the queue is full, the additional 3rd, 4th, 5th and 6th requests are discarded (or "leaked").
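The drop-at-capacity behaviour above can be sketched with a plain FIFO queue (a minimal in-memory sketch with illustrative names; a real server would also drain the queue at a fixed outflow rate):

```python
from collections import deque

class LeakyBucket:
    """Fixed-size FIFO queue: requests beyond capacity are dropped."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def try_enqueue(self, request):
        if len(self.queue) >= self.capacity:
            return False  # bucket full: drop ("leak") the request
        self.queue.append(request)
        return True

    def process_next(self):
        # called by the server at a fixed outflow rate
        return self.queue.popleft() if self.queue else None

# With capacity 2, requests 1 and 2 are accepted; 3, 4 and 5 are dropped.
bucket = LeakyBucket(capacity=2)
results = [bucket.try_enqueue(r) for r in range(1, 6)]
```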
CONS
 A burst of traffic can fill the queue with old requests, starving more recent requests of processing. The algorithm also gives no guarantee that a request is processed within a fixed amount of time.
TOKEN BUCKET ALGORITHM
 For each unique user, we record the Unix timestamp of their last request and their available token count in a hash in Redis.
 Whenever a new request arrives from a user, the rate limiter does two things to track usage:
 First, it fetches the hash from Redis and refills the available tokens based on a chosen refill rate and the time of the user's last request.
 Then it updates the hash with the current request's timestamp and the new available token count.
User 1 has two tokens left in their token bucket and made their last request on Thursday, March 30,
2017 at 10:00 GMT
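The refill-then-consume steps above can be sketched in memory (a minimal sketch with hypothetical names; a production version would keep the `(tokens, last_ts)` pair in a per-user Redis hash as described):

```python
import time

class TokenBucket:
    """Per-user token bucket; in production the (tokens, last_ts) pair
    would live in a Redis hash keyed by user id."""
    def __init__(self, capacity, refill_rate, now=time.time):
        self.capacity = capacity        # maximum tokens in the bucket
        self.refill_rate = refill_rate  # tokens added per second
        self.now = now                  # injectable clock, handy for testing
        self.buckets = {}               # user -> (tokens, last_ts)

    def allow(self, user):
        tokens, last_ts = self.buckets.get(user, (self.capacity, self.now()))
        t = self.now()
        # step 1: refill based on time elapsed since the last request
        tokens = min(self.capacity, tokens + (t - last_ts) * self.refill_rate)
        # step 2: store the new timestamp and token count
        if tokens < 1:
            self.buckets[user] = (tokens, t)
            return False
        self.buckets[user] = (tokens - 1, t)
        return True

limiter = TokenBucket(capacity=60, refill_rate=1.0)  # roughly 60 requests/minute
```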
CONS
 Despite the token bucket algorithm's elegance and tiny memory footprint, its Redis operations aren't atomic. In a distributed environment, this "read-and-then-write" behaviour creates a race condition.
 Imagine a user with only one available token issuing multiple requests. If two separate processes served these requests and both read the available token count before either updated it, each process would think the user had a single token left and had not hit the rate limit.
 Our token bucket implementation could achieve atomicity if each process fetched a Redis lock for the duration of its Redis operations. That, however, would come at the expense of slowing down concurrent requests from the same user and introducing another layer of complexity.
FIXED WINDOW COUNTER
 It increments a per-user request counter for a particular time window; if the counter crosses a threshold, the server drops the request. Redis is used to store the request information.
 Unlike the token bucket algorithm, this approach's Redis operations are atomic. Each request increments a Redis key that includes the request's timestamp.
 When incrementing the request count for a given timestamp, we compare its value to our rate limit to decide whether to reject the request.
 We also tell Redis to expire the key once the current minute has passed, so stale counters don't stick around forever.
RATE LIMIT: 2 REQUESTS PER MINUTE
 A request arriving at 00:00:24 belongs to window 1 and increments the window's counter to 1.
 The next request, at 00:00:36, also belongs to window 1, and the counter becomes 2.
 The next request, at 00:00:49, is rejected because it would push the counter past the limit.
 The request at 00:01:12 can be served because it belongs to window 2.
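The walkthrough above can be sketched with a counter keyed by (user, window) — a minimal in-memory stand-in for the Redis INCR-plus-EXPIRE pattern, with illustrative names:

```python
class FixedWindowCounter:
    """Counter per (user, minute window); in production each counter would
    be a Redis key INCRemented atomically and EXPIREd after the window."""
    def __init__(self, limit, window=60):
        self.limit = limit      # max requests per window
        self.window = window    # window length in seconds
        self.counters = {}      # (user, window_index) -> count

    def allow(self, user, ts):
        key = (user, ts // self.window)  # which window the timestamp falls in
        count = self.counters.get(key, 0) + 1
        self.counters[key] = count
        # compare the incremented value to the limit to accept or reject
        return count <= self.limit
```

Running the slide's scenario (limit 2/minute): requests at 00:00:24 and 00:00:36 are served, 00:00:49 is rejected, and 00:01:12 is served in the new window.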
CONS
 Suppose the server receives a burst of requests in the 55th second of a minute; the limit won't behave as expected.
 For example, with a rate limit of 5 requests per minute, a user who makes 5 requests at 11:00:59 can make 5 more at 11:01:00, because a new counter begins at the start of each minute. Despite a limit of 5 requests per minute, we have now allowed 10 requests in less than one minute!
SLIDING LOGS
 It stores a log of each request, with its timestamp, in Redis or in memory.
 We can efficiently track all of a user's requests in a single sorted set by inserting a new member whose sort value is the Unix microsecond timestamp.
SCENARIO
 When a request comes in, we first remove all outdated timestamps before appending the new request time to the log.
 Then we decide whether the request should be processed based on whether the log size has exceeded the limit. For example, suppose the rate limit is 2 requests per minute:
 Each request looks back over the minute ending at its own timestamp: the request at 00:00:12 covers 11:59:12 – 00:00:12 and the request at 00:00:24 covers 11:59:24 – 00:00:24; both are logged and served. The third request, at 00:00:36, is rejected because its minute window already contains 2 logged requests.
 For a later request at 00:01:25, any log entries older than 00:00:25 fall outside the window and are deleted.
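The prune-then-count logic can be sketched with a per-user deque (a minimal in-memory sketch; in production the log would be the Redis sorted set described above, pruned with a range delete by score):

```python
from collections import deque

class SlidingLog:
    """Keeps a timestamp log per user; a production version would use a
    Redis sorted set scored by the Unix microsecond timestamp."""
    def __init__(self, limit, window=60):
        self.limit = limit
        self.window = window   # window length in seconds
        self.logs = {}         # user -> deque of request timestamps

    def allow(self, user, ts):
        log = self.logs.setdefault(user, deque())
        # drop timestamps that have slid out of the window
        while log and log[0] <= ts - self.window:
            log.popleft()
        # reject if the window already holds `limit` requests
        if len(log) >= self.limit:
            return False
        log.append(ts)
        return True
```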
CONS
 While the precision of the sliding window log approach may be useful for a developer API, it leaves a considerably large memory footprint because it stores a value for every request.
 If the application receives millions of requests, maintaining a log entry for each one in memory is expensive.
SLIDING WINDOW COUNTER
 This approach is similar to sliding logs; the only difference is that instead of storing every log entry, we group a user's request data by timestamp.
 For example, with an hourly rate limit, we increment a counter specific to the current Unix minute and, when a new request arrives, calculate the sum of all counters in the past hour.
 When a request increments a counter in the hash, it also sets the hash to expire an hour later.
RATE LIMITING IN DISTRIBUTED SYSTEMS
SYNCHRONIZATION POLICIES
 We have two load balancers in two different regions, each coupled with a rate limiter (RL). Our application is deployed in a two-node cluster, with Redis as a centralised datastore.
 [Diagram: Request → Load Balancer → RL → Node 1 / Node 2, both backed by the shared Redis]
CONSISTENCY ISSUE
 The problem here is that a user may send requests simultaneously through two different load balancers. If each node tracked its own rate limit, a consumer could exceed the global rate limit by sending requests to different nodes via different load balancers.
 In fact, the greater the number of nodes, the more likely the user is to exceed the global limit.
 Assume the rate limit is 5 requests per minute. For a given user we have already served 4 requests, so the counter in Redis is set to 4.
 Only one more request can be served for that minute.
 Now 2 new requests are fired simultaneously via different nodes. Both rate limiters read the latest counter value from Redis as 4 and presume they can allow the new request.
 Once that happens, we break the rule and serve 6 requests per minute instead of 5.
SOLUTION 1: STICKY SESSIONS
 The simplest way to enforce the limit is to set up sticky sessions in your load balancer so that each consumer is always sent to exactly one node.
 The disadvantages include a lack of fault tolerance and scaling problems when individual nodes get overloaded.
 [Diagram: Request → Load Balancer → RL → Node 1 / Node 2, with each consumer pinned to one node]
RACE CONDITION ISSUE AND SOLUTION
 One of the largest problems with a centralized data store is the potential for race conditions under highly concurrent request patterns. These occur with a naïve "get-then-set" approach: you retrieve the current rate limit counter, increment it, and push it back to the datastore. In the time it takes to perform this full read-increment-store cycle, additional requests can come through, each attempting to store an increment based on a stale (lower) counter value. This lets a consumer sending a very high rate of requests bypass rate limiting controls.
 One way to avoid this is to put a "lock" around the key in Redis, preventing any other process from reading or writing the counter.
 However, the lock quickly becomes a major performance bottleneck and adds latency, since each process must wait until another process releases it.
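The standard fix is to make the read-modify-write a single indivisible operation (Redis INCR rather than get-then-set). A minimal sketch, using a lock-protected dict as a stand-in for Redis and hypothetical names:

```python
import threading

class AtomicCounterStore:
    """Stand-in for Redis: an INCR-style atomic increment, so concurrent
    requests can never act on a stale counter value."""
    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key):
        # one indivisible read-modify-write, like Redis INCR
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

def allow(store, user, minute, limit):
    # increment first, then compare: there is no gap between reading the
    # counter and writing it back for another request to slip into
    return store.incr((user, minute)) <= limit
```

Because the increment and the read happen as one operation, ten simultaneous requests against a limit of 5 admit exactly 5, whereas get-then-set could admit more.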
WHAT ELSE IS THE SOLUTION
 Eventual consistency model
 Instead of fetching the counter from the global Redis store on every request, each node (or rate limiter) keeps a local in-memory cache and reads the request count from it.
 The counter value stored in local memory is asynchronously synced to the global store, and the merged global value is synced back, so the local memory of each node is eventually brought up to date and in sync.
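The local-count-plus-background-sync idea can be sketched as follows (a minimal sketch with hypothetical names; a shared dict stands in for the global Redis store, and `sync()` stands in for the asynchronous background flush):

```python
class LocalCountingNode:
    """Each node counts locally and periodically flushes its delta to the
    shared store; reads use the last synced global value plus the local
    delta, trading strict accuracy for lower per-request latency."""
    def __init__(self, global_store, user):
        self.global_store = global_store  # shared dict standing in for Redis
        self.user = user
        self.local_delta = 0              # requests seen since the last sync
        self.synced_total = global_store.get(user, 0)

    def record_request(self):
        # no round trip to the global store on the request path
        self.local_delta += 1

    def estimated_total(self):
        return self.synced_total + self.local_delta

    def sync(self):
        # push the local delta to the global store, pull the merged total back
        self.global_store[self.user] = (
            self.global_store.get(self.user, 0) + self.local_delta)
        self.local_delta = 0
        self.synced_total = self.global_store[self.user]
```

Between syncs a node can undercount requests served by its peers, which is exactly the accuracy-for-latency trade-off this model accepts.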
REFERENCE
 https://dev.to/ganeshmani/designing-a-scalable-api-rate-limiter-in-node-js-application-5hg3
 https://hechao.li/2018/06/25/Rate-Limiter-Part1/
 https://konghq.com/blog/how-to-design-a-scalable-rate-limiting-algorithm/
 https://www.youtube.com/watch?v=mhUQe4BKZXs&t=183s

Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 

Rate limiting

  • 2. WHAT IS RATE LIMITING  Suppose you expose a set of public RESTful APIs and want to limit the number of requests served over a period of time, both to save resources and to protect the APIs from abuse.  For example, you might allow only 60 calls in a 1-minute window. There are several algorithms for doing this; we will discuss each of them in depth.
  • 4. LEAKY BUCKET  It is a queue that takes requests in first-in, first-out (FIFO) order.  Once the queue is full, the server drops incoming requests until the queue has space to take more.
  • 5. HOW IT WORKS  For example, suppose the queue size is 4 and the server receives requests 1, 2, 3 and 4: all four are enqueued.  When request 5 then arrives, the server drops it.  In the image below, the queue size is 2 and 2 requests are accepted; once the queue is full, the additional 3rd, 4th, 5th and 6th requests are discarded (or leaked).
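The behaviour above can be sketched with a minimal in-memory queue. This is an illustrative sketch, not the deck's implementation; the `LeakyBucket` class and method names are assumptions.

```python
from collections import deque

class LeakyBucket:
    """Fixed-capacity FIFO queue; requests that overflow it are dropped."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def try_enqueue(self, request):
        if len(self.queue) >= self.capacity:
            return False                  # queue full: drop (leak) the request
        self.queue.append(request)
        return True

    def process_one(self):
        """The server drains the queue at a fixed rate, oldest first."""
        return self.queue.popleft() if self.queue else None

# Mirror the slide's example: queue size 2, six incoming requests.
bucket = LeakyBucket(capacity=2)
results = [bucket.try_enqueue(r) for r in range(1, 7)]
print(results)  # requests 1 and 2 accepted, requests 3-6 dropped
```

Once the server processes a request (draining the queue), the freed slot can accept a new one.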
  • 6. CONS  A burst of traffic can fill the queue with old requests, starving more recent requests of processing. It also provides no guarantee that requests are processed within a fixed amount of time.
  • 7. TOKEN BUCKET ALGORITHM  For each unique user, we record their last request’s Unix timestamp and available token count within a hash in Redis.  Whenever a new request arrives from a user, the rate limiter has to do two things to track usage.  First, it fetches the hash from Redis and refills the available tokens based on a chosen refill rate and the time of the user’s last request.  Then it updates the hash with the current request’s timestamp and the new available token count.
  • 8. User 1 has two tokens left in their token bucket and made their last request on Thursday, March 30, 2017 at 10:00 GMT
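The two steps (refill, then spend and write back) can be sketched in memory. A plain dict stands in for the Redis hash here; the constants and the `allow_request` name are illustrative assumptions, and in production the fetch/update would be Redis hash reads and writes.

```python
import time

# Stand-in for the Redis hash: user_id -> {"tokens": n, "ts": last_request_time}
buckets = {}
CAPACITY = 60        # maximum tokens in the bucket (max burst)
REFILL_RATE = 1.0    # tokens added per second (60 per minute)

def allow_request(user_id, now=None):
    now = time.time() if now is None else now
    entry = buckets.get(user_id, {"tokens": CAPACITY, "ts": now})
    # Step 1: refill tokens based on the time since the user's last request.
    elapsed = now - entry["ts"]
    tokens = min(CAPACITY, entry["tokens"] + elapsed * REFILL_RATE)
    # Step 2: spend a token if one is available, then write the hash back.
    allowed = tokens >= 1
    if allowed:
        tokens -= 1
    buckets[user_id] = {"tokens": tokens, "ts": now}
    return allowed
```

With a fixed clock, 60 back-to-back requests drain the bucket, the 61st is refused, and one second later a single refilled token lets the next request through.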
  • 9.
  • 10. CONS  Despite the token bucket algorithm’s elegance and tiny memory footprint, its Redis operations aren’t atomic. In a distributed environment, the “read-and-then-write” behaviour creates a race condition.  Imagine there is only one available token for a user and that user issues multiple requests. If two separate processes serve these requests and concurrently read the available token count before either updates it, each process thinks the user has a single token left and has not hit the rate limit.  Our token bucket implementation could achieve atomicity if each process fetched a Redis lock for the duration of its Redis operations. This, however, would come at the expense of slowing down concurrent requests from the same user and introducing another layer of complexity.
  • 11. FIXED WINDOW COUNTER  It increments a request counter for a user within a particular time window; if the counter crosses a threshold, the server drops the request. It uses Redis to store the request information.  Unlike the token bucket algorithm, this approach’s Redis operations are atomic. Each request increments a Redis key that includes the request’s timestamp. A given Redis key might look like this:  When incrementing the request count for a given timestamp, we compare its value to our rate limit to decide whether to reject the request.  We also tell Redis to expire the key when the current minute has passed, so that stale counters don’t stick around forever.
  • 12.
  • 13. RATE LIMIT: 2 REQUESTS PER MINUTE  A request that comes at 00:00:24 belongs to window 1 and increases the window’s counter to 1.  The next request, at 00:00:36, also belongs to window 1, and the window’s counter becomes 2.  The next request, at 00:00:49, is rejected because the counter has exceeded the limit.  The request that comes at 00:01:12 can be served because it belongs to window 2.
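The worked example above can be sketched with a per-window counter. A dict stands in for Redis here; in Redis the increment would be an atomic `INCR` with an `EXPIRE` on the key, and the key name shown in the comment is an assumed format.

```python
from collections import defaultdict

LIMIT = 2       # requests allowed per window (the slide's example)
WINDOW = 60     # window length in seconds

counters = defaultdict(int)   # stand-in for Redis keys such as "user1:60"

def allow_request(user_id, now):
    # All timestamps in the same minute share one counter key.
    window_start = int(now // WINDOW) * WINDOW
    key = f"{user_id}:{window_start}"
    counters[key] += 1        # in Redis: atomic INCR, plus EXPIRE so
                              # stale window counters vanish on their own
    return counters[key] <= LIMIT
```

Replaying the slide's timestamps (24s, 36s, 49s, 72s) accepts the first two, rejects the third, and accepts the fourth in the new window.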
  • 14.
  • 15. CONS  Suppose the server receives a burst of requests at the 55th second of a minute; the limit won’t work as expected.  For example, if our rate limit were 5 requests per minute and a user made 5 requests at 11:00:59, they could make 5 more requests at 11:01:00 because a new counter begins at the start of each minute. Despite a rate limit of 5 requests per minute, we’ve now allowed 10 requests in less than one minute!
  • 16. SLIDING LOGS  It stores a log of each request with its timestamp, in Redis or in memory.  We can efficiently track all of a user’s requests in a single sorted set by inserting a new member for each request, with the Unix microsecond timestamp as its sort value.
  • 17. SCENARIO  When a request comes in, we first pop all outdated timestamps before appending the new request time to the log.  Then we decide whether the request should be processed depending on whether the log size has exceeded the limit. For example, suppose the rate limit is 2 requests per minute: requests at 00:00:12 and 00:00:24 are accepted; a third request at 00:00:36 falls within the same one-minute frame and is rejected; when a request arrives at 00:01:25, the window becomes 00:00:25 – 00:01:25 and any logged request older than 00:00:25 is deleted.
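The pop-then-append-then-decide steps above can be sketched with a plain sorted list standing in for the Redis sorted set. The names and the choice to log rejected requests as well (which the slide's ordering implies) are assumptions of this sketch.

```python
LIMIT = 2         # requests per minute
WINDOW = 60.0     # window length in seconds

logs = {}         # user_id -> sorted request timestamps (sorted-set stand-in)

def allow_request(user_id, now):
    log = logs.setdefault(user_id, [])
    # 1) Pop every timestamp that has fallen out of the sliding window.
    cutoff = now - WINDOW
    while log and log[0] <= cutoff:
        log.pop(0)
    # 2) Append the new request time, then decide from the log size.
    log.append(now)
    return len(log) <= LIMIT
```

Replaying the scenario: requests at 12s and 24s pass, the one at 36s is rejected, and by 85s the 12s and 24s entries have aged out so the request is served again.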
  • 18. CONS  While the precision of the sliding window log approach may be useful for a developer API, it leaves a considerably large memory footprint because it stores a value for every request.  If an application receives a million requests, maintaining a log entry for each of them in memory is expensive.
  • 19. SLIDING WINDOW COUNTER  This approach is similar to sliding logs. The only difference is that instead of storing every log entry, we group the user’s request data by timestamp.  For example, with an hourly rate limit, we increment a counter specific to the current Unix minute and, when a new request arrives, calculate the sum of all counters in the past hour.  When each request increments a counter in the hash, it also sets the hash to expire an hour later.
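The grouping idea can be sketched with per-minute buckets in a dict standing in for the Redis hash; pruning old buckets by hand here approximates what the hash's one-hour expiry does in Redis. All names and constants are illustrative assumptions.

```python
LIMIT = 60       # requests allowed per hour
WINDOW = 3600    # one hour, in seconds
BUCKET = 60      # group requests into per-minute counters

counters = {}    # user_id -> {minute_start: count}  (Redis hash stand-in)

def allow_request(user_id, now):
    minutes = counters.setdefault(user_id, {})
    # Drop per-minute counters that fell out of the hour-long window
    # (in Redis, the hash's one-hour expiry handles this for us).
    cutoff = now - WINDOW
    for start in [s for s in minutes if s <= cutoff]:
        del minutes[start]
    # Sum all counters in the past hour before admitting the request.
    if sum(minutes.values()) >= LIMIT:
        return False
    minute_start = int(now // BUCKET) * BUCKET
    minutes[minute_start] = minutes.get(minute_start, 0) + 1
    return True
```

This keeps at most 60 counters per user rather than one entry per request, trading the sliding log's exactness for a much smaller footprint.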
  • 20.
  • 21. RATE LIMITING IN DISTRIBUTED SYSTEMS
  • 22. SYNCHRONIZATION POLICIES  We have two load balancers in two different regions, each coupled with a rate limiter (RL), and our application deployed in a two-node cluster (Node 1 and Node 2). We have a centralised Redis database.
  • 23. CONSISTENCY ISSUE  The problem here is that a user can send requests simultaneously through two different load balancers. If each node tracked its own rate limit, a consumer could exceed the global rate limit when requests are sent to different nodes via different load balancers.  In fact, the greater the number of nodes, the more likely the user is to exceed the global limit.  Assume the rate limit is 5 requests per minute. For a given user we have already served 4 requests, so the rate limit counter in Redis is set to 4.  Only one more request may be served for that user in the given minute.  Now 2 new requests are fired simultaneously via different nodes, and both rate limiters read the latest counter value from Redis as 4 and presume they can allow the new request.  Once that happens, we break the rule and serve 6 requests per minute instead of 5.
  • 24. SOLUTION 1: STICKY SESSIONS  The simplest way to enforce the limit is to set up sticky sessions in your load balancer so that each consumer is sent to exactly one node.  The disadvantages include a lack of fault tolerance and scaling problems when nodes get overloaded.
  • 25. RACE CONDITION ISSUE AND SOLUTION  One of the largest problems with a centralized data store is the potential for race conditions under high-concurrency request patterns. These occur with a naïve “get-then-set” approach, in which you retrieve the current rate limit counter, increment it, and then push it back to the datastore. The problem with this model is that in the time it takes to perform a full read-increment-store cycle, additional requests can come through, each attempting to store the incremented counter with an invalid (lower) value. This allows a consumer sending a very high rate of requests to bypass rate limiting controls.  One way to avoid this problem is to put a “lock” around the key in Redis, preventing any other process from reading or writing the counter.  However, this would quickly become a major performance bottleneck and add latency, as each process must wait until another process releases its lock.
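The per-key lock idea can be demonstrated in a single process with threads standing in for concurrent rate-limiter nodes. This is a local sketch of the concept, not a distributed Redis lock; all names here are illustrative.

```python
import threading

LIMIT = 5
counters = {}
key_locks = {}
registry_lock = threading.Lock()

def lock_for(key):
    # One lock per counter key, mirroring a per-key lock around a Redis key.
    with registry_lock:
        return key_locks.setdefault(key, threading.Lock())

def allow_request(key):
    # Holding the lock makes the read-increment-store cycle atomic,
    # at the cost of serializing concurrent requests for the same key.
    with lock_for(key):
        count = counters.get(key, 0)
        if count >= LIMIT:
            return False
        counters[key] = count + 1
        return True

# Ten concurrent requests for one user: exactly LIMIT of them get through.
results = []
threads = [threading.Thread(target=lambda: results.append(allow_request("user1")))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results), counters["user1"])
```

Without the lock, two threads could both read `count == 4` and both admit a request, which is exactly the race described above.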
  • 26. WHAT ELSE IS THE SOLUTION  An eventual consistency model.  Instead of fetching the counter from the global Redis store on every request, each node (or rate limiter) keeps a local in-memory cache and reads the request count from local memory.  The counter value stored in local memory is asynchronously synced with the global store and back, so the local memories of the respective nodes are eventually updated and kept in sync.
  • 27. REFERENCE  https://dev.to/ganeshmani/designing-a-scalable-api-rate-limiter-in-node-js-application-5hg3  https://hechao.li/2018/06/25/Rate-Limiter-Part1/  https://konghq.com/blog/how-to-design-a-scalable-rate-limiting-algorithm/  https://www.youtube.com/watch?v=mhUQe4BKZXs&t=183s