If a system is required to handle short bursts of requests from the same users, the strategy from the previous post will not work. One solution in this case is the so-called Token Bucket: each user starts with n tokens available, and each request consumes one token. Upon reaching 0 tokens, requests are denied. Tokens are replenished at a steady rate.
This post shows an implementation of rate limiting that allows bursts of up to 10 requests per user, with tokens refilled at a rate of 1 per second (a sustained average of 1 request per second).
create unlogged table rate_limit(
    id int not null,
    tokens smallint not null
        default 9
        check (tokens >= 0 and tokens <= 10),
    primary key (id)
);

create index on rate_limit
    using btree (tokens)
    where tokens < 10;
1: Unlogged table for faster writes; please don't change all your tables to unlogged without understanding what you lose =);
2: A new row is inserted when a request is processed, with tokens defaulting to 9 (allowing 10 requests in total: the current one plus the 9 remaining); a quick check of this follows the list;
3: Partial index: only rows that may need refilling (tokens < 10) are kept in the index.
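As a quick, purely illustrative sanity check of the default and the check constraint (the id 42 is an arbitrary value of my choosing, not from the original post):
insert into rate_limit (id) values (42);          -- tokens defaults to 9
select tokens from rate_limit where id = 42;      -- returns 9
update rate_limit set tokens = -1 where id = 42;  -- rejected by the check constraint
delete from rate_limit where id = 42;             -- clean up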
Some SQL-fu, run for each request:
\set id random(0, 10000)
with ins_or_upt as (
    insert into rate_limit as x (id)
    values (:id)
    on conflict (id) do update
        set tokens = x.tokens - 1
        where x.id = excluded.id
          and x.tokens > 0
    returning tokens
)
select tokens
from ins_or_upt
union all
select 0
limit 1;
1: The snippet above is used with pgbench later, hence the use of the \set command;
2: Using the variable set at line 1;
3: If a row already exists for :id, update it instead;
4: Returns the number of remaining tokens, useful feedback for the user;
5: Always return exactly one row with the remaining tokens for :id (a manual run of the same statement follows below).
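To try this by hand in psql (where the pgbench variable :id is not available), the same statement works with a literal id; the id 7 below is an arbitrary choice of mine:
with ins_or_upt as (
    insert into rate_limit as x (id)
    values (7)
    on conflict (id) do update
        set tokens = x.tokens - 1
        where x.id = excluded.id
          and x.tokens > 0
    returning tokens
)
select tokens
from ins_or_upt
union all
select 0
limit 1;
-- the first call returns 9, then 8, 7, ... as tokens are consumed;
-- once the bucket is empty the update matches no row and the
-- fallback "select 0" branch is what comes back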
We can consume tokens, but they are never refilled (yet). The system's backend could, of course, periodically issue a statement to refill tokens, but that's not this blog's theme. Enter pg_cron, an extension that does just that: it periodically executes a statement.
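Installation is not covered here and varies by platform, but a typical setup (my assumption, not part of the original post; the database name is a placeholder) looks roughly like this:
-- in postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_cron'
--   cron.database_name = 'app_db'   -- defaults to 'postgres'
-- then, connected to that database:
create extension if not exists pg_cron;
Once the extension is in place, we just schedule the following: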
select cron.schedule(
    '1 second',
    $$ update rate_limit
       set tokens = tokens + 1
       where tokens < 10 $$
);
Done. This refills 1 token per second for rows with less than 10 tokens, capping at 10 (full bucket after 10 seconds of inactivity). All of this with no backend code.
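To check on, or later remove, the scheduled job, pg_cron keeps its bookkeeping in regular tables and functions; a rough sketch (the job id below is illustrative):
-- list scheduled jobs and their commands
select jobid, schedule, command from cron.job;
-- inspect recent runs (available in newer pg_cron versions)
select status, return_message, start_time
from cron.job_run_details
order by start_time desc
limit 10;
-- remove a job by its id (taken from cron.job)
select cron.unschedule(1);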
Just for fun, I ran the snippet above (saved as insert.sql) through pgbench with a single client (one connection) on a single thread, simulating 10k concurrent users.
docker exec -i pg pgbench \
--time=30 \
--client=1 \
--jobs=1 \
--protocol=prepared \
--no-vacuum \
--file=insert.sql
You’re right, that’s not a proper benchmark. Still, achieving ~20k transactions per second (tps) with my (okayish) setup and stock PostgreSQL configuration makes me wonder why some (NoSQL) database systems have such widespread adoption.
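After the run, a quick query (my own addition, not in the original post) shows how many buckets were touched and how depleted they are:
select count(*)    as users,
       min(tokens) as min_tokens,
       max(tokens) as max_tokens
from rate_limit;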
Figure 1: me changing a table to unlogged
If you are wondering how to run this yourself, download these files, place everything in the same directory, run docker compose up, and then execute the pgbench command above.