Speaking from experience building a SaaS: users are insane. If you are lucky, they use your service; in reality, they will probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limit the number of requests over a given timespan, limit bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
Agenda
Users and traffic patterns
Rate limiting and considerations
Which resources?
Which limits?
Who to limit? Who not to limit?
What happens when a limit is reached?
Where to limit?
MyGet
Hosted private package repository – www.myget.org
NuGet, NPM, Bower, Maven, VSIX, PHP Composer, Symbols, ...
HTTP-based
Web UI for managing things
API for various package managers
PUT/POST – Upload package
DELETE – Delete package via API
GET – Fetch metadata or binary
Background workers for scale
Example: package upload
PUT/POST binary and metadata to front-end
PackageAddedEvent on queue with many handlers handled on back-end
ProcessSymbols
UpdateLatestVersion
Indexing
...
What could possibly go wrong...
Too many uploads incoming!
Front-end
IIS server needs workers to read the incoming network stream
Application logic has to check credentials, subscription, quota
Back-end
Delays in queue processing (luckily workers can process at their own pace)
Too many uploads that are too slow!
Front-end
IIS server needs lots of workers to slowly copy from the network stream
Workers == threads == memory == synchronization == not a happy place
Other examples
Web UI requests
Trying to register spam accounts
Sends a “welcome e-mail”, writes to the datastore
Trying to brute-force login/password reset
Trying to validate credit card numbers via a form on your site
...cost money in the cloud (e.g. per serverless execution)
Robots / Crawlers
Imagine a spider adding 20k items to a shopping cart
For us, usually fine (e.g. Googlebot by default up to 5 req/sec)
Limiting is easy with rel="nofollow" and a robots.txt crawl-delay
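For well-behaved crawlers this can be expressed declaratively. A minimal robots.txt sketch (the /cart/ path is hypothetical; note that Crawl-delay is honored by crawlers such as Bingbot, while Googlebot ignores it and is tuned via Search Console instead):

```
# robots.txt – ask crawlers to slow down or stay out
User-agent: *
Crawl-delay: 10        # seconds between requests; not honored by every crawler
Disallow: /cart/       # hypothetical path: keep spiders out of the shopping cart
```

This only helps against polite robots; abusive clients still need server-side rate limiting.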
Rate limiting – what?
Limits # of requests in a given timeframe
Or limits bandwidth, or another resource – up to you
Helps eliminate:
Unexpected traffic patterns
Unwanted traffic patterns (e.g. script kiddie brute-force login)
Potentially damaging traffic patterns
(accidental and malicious)
Rate limiting – everything???
Everything that could slow down or break your application
Typically everything that depends on a scarce or external resource
CPU
Memory
Disk I/O
Database
External API
So yes, everything...
Let’s do this!
Database with table Events
UserIdentifier – who do we limit
ActionIdentifier – what do we limit
When – event timestamp so we can apply a query
Filter attribute
SELECT COUNT(*) FROM Events WHERE UserIdentifier = <user>
  AND ActionIdentifier = <action> AND When >= NOW() - X
INSERT INTO Events VALUES (<user>, <action>, NOW())
DELETE FROM Events WHERE UserIdentifier = <user>
  AND ActionIdentifier = <action> AND When < NOW() - X
That database was a bad idea!
Very flexible in defining various limits or doing combinations
Very flexible in changing limits, e.g. changing the time period
The database will suffer at scale...
Every request is at least 2 – 3 queries
Constant index churn
We need to manually run DELETE to remove old events
Database size!
That database was a bad idea!
We created a denial of service opportunity!
SELECT, INSERT, DELETE for every request
Consider a simpler technique to limit # of operations
Ideally just a simple counter
“Buckets”
Quantized buckets
Create “buckets” per <identifier> and <timespan>
Use incr <bucket> on Redis and get back the current count per <timespan>
public string GetBucketName(string operation, TimeSpan timespan)
{
    // 10,000 ticks per millisecond: Ticks / (TotalMilliseconds * 10,000)
    // yields the number of whole <timespan> windows since 0001-01-01.
    var bucket = Math.Floor(
        DateTime.UtcNow.Ticks / timespan.TotalMilliseconds / 10000);
    return $"{operation}_{bucket}";
}
Console.WriteLine(GetBucketName("someaction", TimeSpan.FromMinutes(10)));
// someaction_106062120 <-- this will be the key for +/- 10 minutes
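The same idea fits in a few lines of Python. This is a sketch, not the talk's actual implementation: a plain dict stands in for Redis, and `is_allowed` is a hypothetical helper name.

```python
import time
from collections import defaultdict

def bucket_name(operation, window_seconds, now=None):
    """Quantize time into fixed windows – a port of the C# GetBucketName sample."""
    now = time.time() if now is None else now
    bucket = int(now // window_seconds)
    return f"{operation}_{bucket}"

# A plain dict stands in for Redis here; in production you would INCR the
# bucket key and set an expiry so old buckets disappear on their own.
_counters = defaultdict(int)

def is_allowed(operation, limit, window_seconds, now=None):
    key = bucket_name(operation, window_seconds, now)
    _counters[key] += 1          # Redis equivalent: INCR <key>
    return _counters[key] <= limit
```

With Redis, INCR is atomic, so this works across multiple front-end servers without any extra locking.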
Quantized buckets
Super easy and super cheap (atomic write and read on Redis, auto-expire LRU)
Not accurate... (but that may be ok)
A burst straddling a bucket boundary nearly doubles the effective limit: roughly (n-1) × 2 requests per 10 sec
Theoretically: max. 6 / 10 sec in this example
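The boundary effect is easy to demonstrate. In this sketch (illustrative numbers: a limit of 3 per 10-second bucket), six requests all pass within less than a second because they straddle a bucket boundary:

```python
from collections import defaultdict

LIMIT, WINDOW = 3, 10          # 3 requests per 10-second bucket
counters = defaultdict(int)

def allowed(t):
    bucket = int(t // WINDOW)  # which fixed window does timestamp t fall into?
    counters[bucket] += 1
    return counters[bucket] <= LIMIT

# Three requests just before the bucket boundary, three just after:
results = [allowed(t) for t in (9.7, 9.8, 9.9, 10.1, 10.2, 10.3)]
print(results)  # all six pass, although only ~0.6 seconds elapsed
```

For many workloads this imprecision is acceptable; when it is not, a leaky/token bucket gives smoother behavior.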
Leaky bucket
“Imagine a bucket where water is
poured in at the top and leaks from the
bottom.
If the rate at which water is poured in
exceeds the rate at which it leaks, the
bucket overflows.“
Widely used in telecommunications to deal with
bandwidth/bursts.
Leaky bucket
Get <delta> tokens, with maximum <count> per <timespan>
public int GetCallsLeft() {
    if (_tokens < _capacity) {
        var referenceTime = DateTime.UtcNow;
        // Number of whole refill intervals elapsed since the last refill
        var delta = (int)((referenceTime - _lastRefill).Ticks / _interval.Ticks);
        if (delta > 0) {
            // Each elapsed interval refills the bucket back up to capacity
            _tokens = Math.Min(_capacity, _tokens + (delta * _capacity));
            _lastRefill = referenceTime;
        }
    }
    return _tokens;
}
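A self-contained variant of the same idea, sketched in Python. Unlike the C# sample above, this one refills a single token per elapsed interval (a common token-bucket refill policy); the class name and the injected clock are my own additions for testability, not part of the talk's code:

```python
class TokenBucket:
    """Token bucket that refills one token per elapsed interval."""

    def __init__(self, capacity, interval_seconds):
        self.capacity = capacity          # maximum burst size
        self.interval = interval_seconds  # time to earn one token back
        self.tokens = capacity            # start with a full bucket
        self.last_refill = 0.0

    def _refill(self, now):
        elapsed = int((now - self.last_refill) / self.interval)
        if elapsed > 0:
            self.tokens = min(self.capacity, self.tokens + elapsed)
            self.last_refill = now

    def try_consume(self, now):
        """Take one token if available; the clock is passed in for determinism."""
        self._refill(now)
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False
```

Because refills are spread over time, bursts are capped at `capacity` while the long-run rate converges to one request per interval – no bucket-boundary doubling.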
Things to decide on
Decide on the resources to limit
Decide on a sensible limit
Come up with an identifier to limit on
Decide on exceptions to the rule
What are sensible limits?
Approach 1
1. Figure out current # of requests for a certain resource
2. Set limits
3. Get angry phone calls from customers
Approach 2
1. Figure out current # of requests for a certain resource
2. Set limits, but only log when a request would be limited
3. Analyze logs, set new limits, ...
4. Start rate limiting
5. Keep measuring
Will you allow bursts or not?
Laddering! Different buckets per identifier and resource...
10 requests per second can be 36,000 requests per hour.
But 10 requests per second could also be capped at 1,000 requests per hour.

Bucket       Operation A   Operation B   Operation C
Per second   10            10            100
Per minute   60            60            500
Per hour     3600          600           500
...

Operation A: steady flow of max. 10/sec
Operation B: steady flow of max. 10/sec, but only 600/hour max.
Operation C: bursts of up to 100/sec, but only 500/hour max.
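Laddering simply means a request must clear every bucket size at once. A minimal sketch (the numbers follow the "Operation B" column above; the `allowed` helper is hypothetical):

```python
from collections import defaultdict

# Ladder for "Operation B": 10/sec, 60/min, 600/hour
LADDER = [(1, 10), (60, 60), (3600, 600)]
counters = defaultdict(int)

def allowed(operation, t):
    keys = [(f"{operation}_{window}_{int(t // window)}", limit)
            for window, limit in LADDER]
    # Only allow the request when every rung of the ladder has room left
    if any(counters[key] >= limit for key, limit in keys):
        return False
    for key, _ in keys:
        counters[key] += 1
    return True
```

A burst may exhaust the per-second bucket while the hourly bucket still has room, or the other way around; both must agree before the request passes.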
What will be the identifier?
Per IP address?
But what about NAT/proxies?
Per user?
But how do you limit anonymous users?
Per session?
But what if the user starts a new session for every request?
Or what if there is no such thing as a session?
Per browser?
But everyone uses Chrome!
What will be the identifier?
Probably a combination!
IP address (debatable)
+ User token (or “anonymous”)
+ Session token
+ Headers (user agent + accept-language + some cookie + ...)
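Combining these identifiers can be as simple as hashing them into one key. A sketch (the function name and 16-character truncation are my choices, not from the talk):

```python
import hashlib

def client_key(ip, user=None, session=None, user_agent=""):
    """Combine several weak identifiers into one rate-limiting key."""
    parts = [ip, user or "anonymous", session or "-", user_agent]
    # Hash so the key has a fixed size and no raw PII ends up in cache keys/logs
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]
```

Each signal is weak on its own (NAT, anonymous users, new sessions), but the combination is much harder to rotate cheaply.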
Decide on exceptions
Do we rate limit all users? Do we have separate limits for certain users?
Dynamic limiting
Do we rate limit all IP addresses?
What about ourselves?
What about our monitoring tools?
What about web crawlers?
What about certain datacenter ranges? (https://github.com/client9/ipcat)
“IP addresses that end web consumers should not be using”
What happens when the user hits the limit?
Do we just “black hole” the request and close the connection?
Do we tell the user?
API: status code 429 Too Many Requests
Web: error page stating rate limit exceeded / captcha (StackOverflow)
Try to always tell the user
Format? Depends on Accept header (text/html vs. application/json)
Tell them why they were throttled
Can be a simple link to API documentation
Tell them when to retry (e.g. GitHub does this even before rate limiting)
Status: 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
X-RateLimit-Reset: 1372700873
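Producing those headers is straightforward. A sketch (the helper name is hypothetical; the header names mirror GitHub's, and `X-RateLimit-Reset` is epoch seconds as in the example above):

```python
def rate_limit_headers(limit, remaining, reset_epoch):
    """Build the status code and GitHub-style X-RateLimit-* response headers."""
    status = 200 if remaining > 0 else 429   # 429 Too Many Requests
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_epoch),  # epoch seconds when the window resets
    }
    return status, headers
```

Sending these on every response, not just on rejections, lets well-behaved clients back off before they ever hit the limit.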
Rate limiting – where?
MvcThrottle
Runs as action filter
Requests per timespan
Per action, user, IP, ... (so knows about actions)
Owin.Limits
Runs as OWIN middleware
Bandwidth, concurrent requests, ...
No knowledge about application specifics
Many, many others
How far do we allow traffic before saying no?
KNOWLEDGE ABOUT THE OPERATION vs. RESOURCES SPENT
The further in we let a request travel, the more we know about it – but the more resources it has already consumed
What options are there?
In our application
ActionFilter / Middleware / HttpModule / ...
Easy to add custom logic, based on request details
On the server
Outside of our server
Outside of our datacenter
What options are there?
In our application
On the server
IIS has dynamic IP restrictions, bit rate throttling, <limits />
Kestrel minimum speed throttle
Found these less flexible in terms of configuration...
E.g. IIS dynamic IP restrictions returns 403 Forbidden, wth!
Not a big fan, as these are usually HttpModules anyway (and thus hit our app)
Outside of our server
Outside of our datacenter
What options are there?
In our application
On the server
Outside of our server
Reverse proxy (IIS Application Request Routing, Nginx, HAProxy, Squid, ...)
Traffic does not even hit our application server, yay!
Outside of our datacenter
What options are there?
In our application
On the server
Outside of our server
Outside of our datacenter
Azure API management, CloudFlare
Filters traffic very early in the request, yay!
Often also handle DDoS attacks
Often more expensive
Imagine...
Your marketing team decided to bridge the physical world with the virtual:
“Use our in-store wifi to join this online contest and win!”
What if... all those users are NAT-ed from the same IP
And your rate limiting does not allow for 100 simultaneous users from an IP...
Monitor your rate limiting!
Monitor what is happening in your application
Who are we rate limiting, when, why
Add circuit breakers (“exceptional exceptions”)
“This flood of requests is fine for now”
Conclusion
Users are crazy! (usually unintentionally)
We need rate limiting
Decide on the resources to limit (everything!)
Decide on a sensible limit (measure!)
Come up with an identifier to limit on
Decide on exceptions
Decide what happens when the user reaches a limit
Decide where in the request/response flow to limit
Monitor your rate limiting
Please rate this session using the Event Master Mobile App, at the booth in the Lobby, or at login.developerdays.pl
Prerequisites: create database and make sure it works!
Open demo 01 - DemoLetsDoThis.sln
In Startup.cs explain adding EF context and show how EventsContext is built
Next, show RateLimitFilter is applied to every request
Implementation of RateLimitFilter
Uses an identifier for the user (either User.Identity.Username or “anonymous” + IP address)
Uses ControllerActionDescriptor to determine controller + action
We then check if there are > 5 requests to this resource
We always add an event to the DB – DANGEROUS!!!
And drop older events
Show in Fiddler, requesting: http://localhost:56983/api/hello/maarten
Open MvcThrottle, in project MvcThrottle.Demo
Show HomeController, show EnableThrottling attribute
Run the application - http://localhost:53048/Home/About – see it in action after a few refreshes
Mention we can respond to throttling depending on the client type!
Open MvcThrottleCustomFilter
See filterContext.HttpContext.Request.AcceptTypes.Any(accept => accept.Contains("html")) -> custom view result
Mention we can filter based on client IP
In FilterConfig.cs, there is an IP whitelist of folks we never want to throttle
Same goes with user agents
Same goes with endpoints
The REALLY nice thing: I can enable/disable per action in MVC
Show BlogController
REALLY NICE, throttling follows my logic
The SAD thing: open 04-snapshot
I did a load test – non scientific!
This thing has low overhead (I did a few thousand requests), but still my application spent 12% of its time rate limiting requests
Run the nginx docker container from 05-nginx
Show a few requests:
http://localhost:8080/ proxies MyGet
http://localhost:8080/F/googleanalyticstracker/api/v2 proxies a feed
A few refreshes of http://localhost:8080/F/googleanalyticstracker/api/v2 get throttled
So we proxy our app, and get to rate limit some calls, sweet!
Open nginx.conf and go through some boilerplate:
Worker processes (typically == # of CPU cores) and worker connections
http section sets up a web server, we can add SSL etc here as well
Under server, we define the different resources
/ just proxies www.myget.org and injects a header
/Content proxies and caches (yay added bonus of NGinx)
/F/ is where things get interesting – we limit requests to this one using “mylimit”
Defines a key, names a zone, names the timespan, names the limit
Can mix and match to create key: limit_req_zone $binary_remote_addr$http_authorization zone=mylimit:10m rate=2r/s;
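Putting those directives together, a minimal sketch of the relevant http-block parts of nginx.conf (the zone name, rate, and proxied host follow the demo; the burst value and the 429 status are my assumptions – nginx returns 503 for limited requests by default):

```nginx
# 2 req/sec per client (IP + Authorization header), 10 MB of shared counter state
limit_req_zone $binary_remote_addr$http_authorization zone=mylimit:10m rate=2r/s;

server {
    listen 8080;

    location / {
        proxy_pass https://www.myget.org;    # plain pass-through proxy
    }

    location /F/ {
        limit_req zone=mylimit burst=5;      # queue small bursts, reject the rest
        limit_req_status 429;                # report "Too Many Requests" instead of 503
        proxy_pass https://www.myget.org;
    }
}
```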
Prerequisites
Create Azure API management (!!! Well upfront – takes time !!!)
Force-push the 06-apim repo to it: git remote add origin .......<new url>......., then git push --force --set-upstream origin master
Show portal – especially “API” and “PRODUCT”
“API” defines API calls. From portal, show we can create this based on a Swagger definition
For demo here, created manually and registered /F/* and /* to just pass-through
Under products
Show anonymous and unlimited
Explain the idea of API management is to sell access to your API and allow people to purchase a product to get better/less/… access to an API
Anonymous is all I’ll use during the demo
Anonymous has policies – show rate limit is 100 per 60 sec
From PUBLISHER PORTAL (click), we have a policy for the Feed endpoint as well, which is more strict
Show https://ratelimitingdemo.azure-api.net/ is smooth
Show a few refreshes of https://shit.azure-api.net/F/googleanalyticstracker/api/v2/ limited
Requests that are limited never hit my server