API rate limiting

Rate limiting curbs abuse such as brute-force logins, scraping of public catalogs, floods of expensive admin requests, and flooding of notification webhooks. Policies live in MongoDB, and super admins can edit them from the admin app without redeploying the server.

How a request flows

1. CORS: browser origin check.
2. `requestIdMiddleware`: assigns or honors `X-Request-Id`.
3. `rateLimitMiddleware`: loads cached config, matches policies, updates counters, may return 429.
4. Route + handler: auth, validation, business logic.

If rate limiting is disabled globally (`enabled: false` in the config), step 3 returns immediately without matching policies or updating counters.
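The ordering above can be illustrated with a small, dependency-free sketch. It does not use the real Hono middleware API: `Ctx`, `run`, `step`, `rateLimit`, and `buildChain` are hypothetical names, and the chain is synchronous for brevity. It only shows that the limiter runs after CORS and request-id handling, before the handler, and that a 429 stops the chain.

```typescript
// Hypothetical minimal middleware chain; the real app mounts these with Hono.
type Ctx = { path: string; trace: string[]; status?: number };
type Middleware = (ctx: Ctx, next: () => void) => void;

// Runs middlewares in order; each one decides whether to call next().
function run(ctx: Ctx, chain: Middleware[]): Ctx {
  const dispatch = (i: number): void => {
    const mw = chain[i];
    if (mw) mw(ctx, () => dispatch(i + 1));
  };
  dispatch(0);
  return ctx;
}

// Stand-in for a step that always continues (CORS, request id, handler).
const step = (name: string): Middleware => (ctx, next) => {
  ctx.trace.push(name);
  next();
};

// Step 3: pass straight through when the global switch is off,
// otherwise record the check and stop the chain with 429 when over limit.
function rateLimit(enabled: boolean, overLimit: boolean): Middleware {
  return (ctx, next) => {
    if (!enabled) return next();
    ctx.trace.push("rateLimit");
    if (overLimit) {
      ctx.status = 429; // handler never runs
      return;
    }
    next();
  };
}

function buildChain(enabled: boolean, overLimit: boolean): Middleware[] {
  return [step("cors"), step("requestId"), rateLimit(enabled, overLimit), step("handler")];
}
```

For example, `run({ path: "/api/v1/users", trace: [] }, buildChain(true, true))` ends with `status` 429 and a trace that stops at `rateLimit`.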

Glossary (quick reference)

Term	What it means
Policy	One rule: which URLs, which HTTP methods, how many requests allowed in a window, how to count them, and whether to block.
Window	A time slice (`windowSeconds`). Counts reset or roll when the window advances.
Limit	Maximum usage the policy allows per window; how usage is measured depends on the algorithm (see Algorithms below).
Identity	Who we count: by IP, by logged-in user id, both (prefer user), or internal worker.
Mode	`off` / `shadow` / `enforce-soft` / `enforce` — whether we only observe, soft-block, or hard-block.
Store	Where counters live: memory (single process) or mongo (shared across instances).
Config cache	In-process copy of the Mongo config; refreshes about every 30 seconds after load or invalidation.
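As a rough illustration of the config cache entry, the sketch below keeps one in-process copy and reloads it once it is 30 seconds old or after an explicit invalidation. The class name, the synchronous `load` callback (a stand-in for the Mongo read), and the injectable clock are assumptions made for this example and may not match `config-cache.ts`.

```typescript
// Hypothetical shape; the real config document has more fields.
type CachedConfig = { enabled: boolean; policies: unknown[] };

class ConfigCache {
  private value: CachedConfig | null = null;
  private loadedAt = 0;
  private readonly load: () => CachedConfig;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => CachedConfig, ttlMs = 30_000, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached copy, reloading when it is missing or older than the TTL.
  get(): CachedConfig {
    const t = this.now();
    if (this.value === null || t - this.loadedAt >= this.ttlMs) {
      this.value = this.load();
      this.loadedAt = t;
    }
    return this.value;
  }

  // Called after an admin change so the next get() reloads immediately.
  invalidate(): void {
    this.value = null;
  }
}
```

Because each process holds its own copy in this sketch, a change made through one instance would reach other instances only when their own TTL expires.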

Policy fields (what each thing does)

Each policy in the database (and in the admin UI) has these fields:

Field	Purpose
`id`	Stable unique key (e.g. `users.read`). Used in logs, events, and API paths.
`name`	Human label in admin UI.
`routeGroup`	Tag for reporting (groups related limits in monitoring).
`pathPrefixes`	List of URL path prefixes. A request matches if its path equals a prefix or starts with `prefix + '/'`. Example: `/api/v1/users` matches `/api/v1/users/me`.
`methods`	Optional. If set, only these HTTP methods match (uppercase in storage, e.g. `GET`, `POST`). If omitted, any method matches the paths.
`identity`	`ip` — count per client IP. `user` — per Bearer access token user id (decoded without running full route auth). `user_or_ip` — use user id when present, else IP. `internal` — only applies when `x-internal-key` is present (worker calls).
`windowSeconds`	Length of the rate window in seconds (for fixed / sliding) or the period used to derive token refill rate for token_bucket.
`limit`	Maximum count (fixed/sliding) or bucket capacity (token bucket).
`algorithm`	`fixed` — count requests in the current window. `sliding` — smoother estimate using current + previous window. `token_bucket` — burst-friendly refill (good for expensive endpoints).
`mode`	`off` — ignore. `shadow` — count and log “would block” but always allow the request. `enforce-soft` — allow up to about 3× the limit before blocking. `enforce` — block when over limit.
`weight`	When several policies match, they are evaluated in weight order (higher first). The most restrictive outcome (lowest remaining quota or blocked) wins for the response.
`allowlist`	List of raw identity strings (e.g. `ip:203.0.113.5`, `user:<mongoId>`) that skip this policy.
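The matching rules for `pathPrefixes`, `methods`, and `weight` can be sketched as below. The `Policy` type lists only the fields the example needs, and `matchesPath`, `matchPolicies`, `Outcome`, and `mostRestrictive` are hypothetical names. Treating "blocked" as more restrictive than any allowed outcome, and then comparing remaining quota, is one reading of the `weight` row rather than confirmed engine behavior.

```typescript
type Policy = {
  id: string;
  pathPrefixes: string[];
  methods?: string[]; // uppercase; omitted means any method
  weight: number;
};

// A path matches when it equals the prefix or continues it after a "/".
function matchesPath(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + "/");
}

// Returns the matching policies, highest weight first.
function matchPolicies(policies: Policy[], method: string, path: string): Policy[] {
  const upper = method.toUpperCase();
  return policies
    .filter((p) => p.pathPrefixes.some((prefix) => matchesPath(path, prefix)))
    .filter((p) => !p.methods || p.methods.includes(upper))
    .sort((a, b) => b.weight - a.weight);
}

type Outcome = { policyId: string; blocked: boolean; remaining: number };

// Picks the outcome reported to the client: blocked beats allowed,
// otherwise the lowest remaining quota wins.
function mostRestrictive(outcomes: Outcome[]): Outcome | undefined {
  return outcomes.reduce<Outcome | undefined>((worst, o) => {
    if (!worst) return o;
    if (o.blocked !== worst.blocked) return o.blocked ? o : worst;
    return o.remaining < worst.remaining ? o : worst;
  }, undefined);
}
```

Note that `/api/v1/usersX` does not match the prefix `/api/v1/users`, because the match requires either equality or a following `/`.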

Super-admin bypass

If the user is a super admin (isAdmin: true on the user document), policies with identity: user are skipped for that user. IP-based and user_or_ip policies still apply when falling back to IP.

GET /health and GET /ready are never rate limited.
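A minimal sketch of the skip rules follows: exempt health routes, the super-admin bypass for `identity: user` policies, and the per-policy `allowlist`. The types and function names are hypothetical, and the sketch assumes the caller's identity key (for example `ip:203.0.113.5`) has already been resolved elsewhere.

```typescript
type IdentityKind = "ip" | "user" | "user_or_ip" | "internal";
type PolicyLite = { identity: IdentityKind; allowlist: string[] };
type Caller = { identityKey: string; isAdmin: boolean };

// Health probes are never rate limited, regardless of policies.
function isExemptRoute(method: string, path: string): boolean {
  return method.toUpperCase() === "GET" && (path === "/health" || path === "/ready");
}

// True when this policy should not be evaluated for this caller.
function skipsPolicy(policy: PolicyLite, caller: Caller): boolean {
  // Super-admin bypass applies only to policies counted per user.
  if (policy.identity === "user" && caller.isAdmin) return true;
  // Allowlist entries are raw identity strings such as "ip:203.0.113.5".
  return policy.allowlist.includes(caller.identityKey);
}
```

In this sketch a super admin is still subject to `ip` policies, so a per-IP login limit would continue to apply to admin accounts.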

Client IP detection

The limiter prefers CF-Connecting-IP (Cloudflare), otherwise the last entry in X-Forwarded-For (typical reverse-proxy shape). If neither is present, identity may show as ip:unknown.
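The header preference described above might look like the following sketch. `HeaderGetter` and `clientIdentity` are hypothetical names, and the getter is assumed to be case-insensitive, as HTTP header lookups usually are.

```typescript
// Stand-in for a request's header lookup.
type HeaderGetter = (name: string) => string | undefined;

// Prefers CF-Connecting-IP, then the last X-Forwarded-For entry, else "unknown".
function clientIdentity(getHeader: HeaderGetter): string {
  const cf = getHeader("cf-connecting-ip")?.trim();
  if (cf) return `ip:${cf}`;

  const last = getHeader("x-forwarded-for")
    ?.split(",")
    .map((part) => part.trim())
    .filter(Boolean)
    .pop();
  return `ip:${last ?? "unknown"}`;
}
```

Taking the last `X-Forwarded-For` entry trusts the address appended by the nearest proxy, which is only meaningful when the app sits behind a proxy you control.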

Algorithms (what each one does)

Algorithm	Behavior	Good for
`fixed`	One counter per identity per time bucket (e.g. per minute). Simple; can allow short bursts at bucket edges.	Auth walls, coarse caps.
`sliding`	Uses current window count plus a weighted fraction of the previous window — closer to “rolling” fairness.	Public read APIs.
`token_bucket`	The bucket holds at most `limit` tokens and refills at `limit / windowSeconds` tokens per second; each request consumes one token and counts as over limit when no token is available. Allows controlled bursts.	Expensive admin POSTs.
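The three algorithms can be sketched as pure functions. These are illustrative only: the function names and the counter key format are hypothetical, and the sliding weight `1 - elapsed fraction of the current window` is the common sliding-window-counter approximation, assumed here rather than taken from `algorithms.ts`.

```typescript
// fixed: one counter per identity per time bucket.
function fixedWindowKey(identity: string, nowMs: number, windowSeconds: number): string {
  const bucket = Math.floor(nowMs / (windowSeconds * 1000));
  return `${identity}:${bucket}`;
}

// sliding: current count plus the part of the previous window that still
// overlaps a rolling window ending now.
function slidingEstimate(current: number, previous: number, nowMs: number, windowSeconds: number): number {
  const windowMs = windowSeconds * 1000;
  const elapsed = (nowMs % windowMs) / windowMs; // fraction of current window used
  return current + previous * (1 - elapsed);
}

type Bucket = { tokens: number; updatedAtMs: number };

// token_bucket: refill at limit / windowSeconds tokens per second, capped at
// limit, then try to consume one token.
function takeToken(
  bucket: Bucket,
  nowMs: number,
  limit: number,
  windowSeconds: number,
): { allowed: boolean; bucket: Bucket } {
  const elapsedMs = Math.max(0, nowMs - bucket.updatedAtMs);
  const refill = (elapsedMs * limit) / (windowSeconds * 1000);
  const tokens = Math.min(limit, bucket.tokens + refill);
  if (tokens >= 1) {
    return { allowed: true, bucket: { tokens: tokens - 1, updatedAtMs: nowMs } };
  }
  return { allowed: false, bucket: { tokens, updatedAtMs: nowMs } };
}
```

With `limit = 2` and `windowSeconds = 60`, a full bucket allows two immediate requests, rejects the third, and allows one more after 30 seconds, once a single token has refilled.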

Modes (safety ramp)

Mode	Effect
`off`	Policy does not run.
`shadow`	Counts and records shadow_violation events but never returns 429 for that policy.
`enforce-soft`	Blocks only after roughly triple the configured limit.
`enforce`	Blocks when the limit is exceeded.

Global switch: enabled: false on the singleton config turns rate limiting off for the whole app (after cache refresh).
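How mode and the global switch turn a usage count into a decision can be sketched as follows. `decide`, `Decision`, and `SOFT_MULTIPLIER` are hypothetical names, and the exact factor of 3 is an assumption based on the "roughly triple" wording above.

```typescript
type Mode = "off" | "shadow" | "enforce-soft" | "enforce";
type Decision = "skip" | "allow" | "shadow_violation" | "block";

// Assumed factor for enforce-soft; the source only says "roughly triple".
const SOFT_MULTIPLIER = 3;

// usage is the count (or estimate) after including the current request.
function decide(mode: Mode, usage: number, limit: number, globalEnabled = true): Decision {
  if (!globalEnabled || mode === "off") return "skip";
  if (usage <= limit) return "allow";
  if (mode === "shadow") return "shadow_violation"; // logged, request still allowed
  if (mode === "enforce-soft") {
    return usage > limit * SOFT_MULTIPLIER ? "block" : "allow";
  }
  return "block";
}
```

Under these assumptions, a policy with `limit` 10 in `enforce-soft` starts blocking at the 31st request in a window, while the same policy in `shadow` never blocks.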

Environment variables

Set in hono-backend .env or process env:

Variable	Default	Meaning
`RATE_LIMIT_STORE`	`memory`	`memory` = counters in Node RAM (fast; single instance; lost on restart). `mongo` = counters in MongoDB (shared; survives restart; use with multiple app instances).
`RATE_LIMIT_EVENT_SAMPLE_RATE`	`0.01`	Fraction of allowed requests that write a sampled event (`0`–`1`). Blocked and shadow outcomes are recorded without sampling.
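Reading these two variables might look like the sketch below. The function names are hypothetical, and falling back to the defaults on unrecognized or non-numeric values is an assumption rather than documented behavior. The environment is passed in as a plain object so the example stays self-contained; in the app this would typically be `process.env`.

```typescript
type StoreKind = "memory" | "mongo";
type RateLimitEnv = { store: StoreKind; eventSampleRate: number };

function readRateLimitEnv(env: Record<string, string | undefined>): RateLimitEnv {
  // Anything other than "mongo" falls back to the default in-memory store.
  const store: StoreKind = env["RATE_LIMIT_STORE"] === "mongo" ? "mongo" : "memory";

  // Clamp the sample rate to the 0..1 range; non-numeric input uses the default.
  const parsed = Number(env["RATE_LIMIT_EVENT_SAMPLE_RATE"] ?? "0.01");
  const eventSampleRate = Number.isFinite(parsed) ? Math.min(1, Math.max(0, parsed)) : 0.01;

  return { store, eventSampleRate };
}

// Blocked and shadow outcomes are always recorded; allowed ones are sampled.
function shouldRecordEvent(
  outcome: "allowed" | "blocked" | "shadow",
  sampleRate: number,
  random: () => number = Math.random,
): boolean {
  return outcome !== "allowed" || random() < sampleRate;
}
```

At the default rate of `0.01`, about one in a hundred allowed requests would write an event.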

MongoDB collections (what persists where)

Collection / model	Role
`RateLimitConfig` (`key: 'global'`)	The enabled flag and full `policies[]` array. Seeded on first access if missing.
`RateLimitCounter`	Per-key counters and token-bucket state when `RATE_LIMIT_STORE=mongo`. TTL on documents cleans old windows.
`RateLimitEvent`	Optional audit stream: blocked, shadow violations, sampled allows. Short TTL (e.g. ~14 days).
`RateLimitRollup`	Daily aggregates for dashboards (may be sparse until a job fills them).
`RateLimitAudit`	Admin change log: who changed config/policies and before/after snapshots.

Admin UI: how to use it

Requires being logged in to admin-frontend as staff.

Rate Limits (/rate-limits) — table of policies; select one to edit limit, window, mode, algorithm, identity, prefixes, allowlist. Toggle Enable all / Disable all for the global switch. Recent changes shows audit entries.
Rate Limit Monitoring (/rate-limits/monitoring) — rollups and recent events (blocked vs shadow vs sampled allowed).

Permissions:

VIEW_STATS — read config summary, events, metrics (GET endpoints).
MANAGE_RATE_LIMITS — super admin only in the backend role map: change policies, global enable, create/delete policy rows, dry-run test.

If your account has only the manager or creator role, the Rate Limits nav items are hidden unless the account also has `isAdmin: true`.

HTTP API (for automation or curl)

Base path: `/api/v1/admin/rate-limits`. Every endpoint requires a Bearer access token and passes through the admin middleware.

Method	Path	Permission	Returns / does
`GET`	`/`	`VIEW_STATS`	`{ config, lastRollups, auditLog }`
`GET`	`/events?policy=&outcome=&limit=`	`VIEW_STATS`	Paginated event list + total
`GET`	`/metrics?from=&to=`	`VIEW_STATS`	`{ rollups }` filtered by `dayKey` range
`PATCH`	`/`	`MANAGE_RATE_LIMITS`	Body `{ enabled?, policies? }` — updates singleton + invalidates cache
`POST`	`/policies`	`MANAGE_RATE_LIMITS`	Body: full policy — upserts by `id`
`PATCH`	`/policies/:id`	`MANAGE_RATE_LIMITS`	Partial policy fields
`DELETE`	`/policies/:id`	`MANAGE_RATE_LIMITS`	Removes policy
`POST`	`/test`	`MANAGE_RATE_LIMITS`	Body `{ method, path }` — returns which policies would match (no counters changed)
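For automation, the dry-run and global-switch calls from the table could be prepared as below. The helper names and the `HttpRequest` shape are hypothetical. The sketch only builds the request description and does not send it, and it assumes the `/` route is reachable at the base path without a trailing slash.

```typescript
type HttpRequest = {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
};

const BASE_PATH = "/api/v1/admin/rate-limits";

function authHeaders(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}`, "Content-Type": "application/json" };
}

// POST /test: asks which policies would match, without changing counters.
function dryRunRequest(origin: string, accessToken: string, method: string, path: string): HttpRequest {
  return {
    url: `${origin}${BASE_PATH}/test`,
    method: "POST",
    headers: authHeaders(accessToken),
    body: JSON.stringify({ method, path }),
  };
}

// PATCH /: flips the global switch (requires MANAGE_RATE_LIMITS).
function setEnabledRequest(origin: string, accessToken: string, enabled: boolean): HttpRequest {
  return {
    url: `${origin}${BASE_PATH}`,
    method: "PATCH",
    headers: authHeaders(accessToken),
    body: JSON.stringify({ enabled }),
  };
}
```

A request built this way can be sent with the standard `fetch`, for example `const req = dryRunRequest(origin, token, "GET", "/api/v1/users/me"); await fetch(req.url, req);`.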

What clients see when blocked

Status 429 Too Many Requests
Headers: RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and Retry-After (seconds)
Body (JSON): error, code: "RATE_LIMITED", requestId, policy (policy id), retryAfterSeconds

Browsers can read these headers only if CORS exposes them; `index.ts` adds them to `exposeHeaders`.
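The blocked response can be sketched as a plain object, together with a small client-side helper that reads `Retry-After`. The function names, the `error` message text, and expressing `RateLimit-Reset` as seconds until reset are assumptions for this example. The header and body field names come from the list above.

```typescript
type BlockedInfo = {
  policyId: string;
  limit: number;
  retryAfterSeconds: number;
  requestId: string;
};

// Server side: shape of the 429 response for a blocked request.
function buildBlockedResponse(info: BlockedInfo) {
  return {
    status: 429,
    headers: {
      "RateLimit-Limit": String(info.limit),
      "RateLimit-Remaining": "0",
      "RateLimit-Reset": String(info.retryAfterSeconds),
      "Retry-After": String(info.retryAfterSeconds),
    },
    body: {
      error: "Too many requests", // message text is illustrative
      code: "RATE_LIMITED",
      requestId: info.requestId,
      policy: info.policyId,
      retryAfterSeconds: info.retryAfterSeconds,
    },
  };
}

// Client side: how long to wait before retrying, with a fallback when the
// header is missing or not a non-negative number.
function retryDelayMs(headers: Record<string, string | undefined>, fallbackMs = 1000): number {
  const seconds = Number(headers["Retry-After"]);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}
```

A client that honors `retryDelayMs` before retrying avoids adding further requests to a window that is already over its limit.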

Fail-open behavior

If the limiter throws (e.g. DB timeout), the middleware logs RATE_LIMITER_DEGRADED and allows the request so a broken limiter does not take down the API.
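The fail-open rule amounts to a try/catch around the limiter check, as in this sketch. The wrapper name, the `LimitResult` shape, and the synchronous `check` callback are assumptions for brevity; the real middleware is presumably asynchronous.

```typescript
type LimitResult = { allowed: boolean; degraded: boolean };

// Runs the limiter check; on any error, logs and allows the request.
function checkFailOpen(
  check: () => boolean, // returns true when the request is within limits
  log: (code: string, err: unknown) => void,
): LimitResult {
  try {
    return { allowed: check(), degraded: false };
  } catch (err) {
    log("RATE_LIMITER_DEGRADED", err);
    return { allowed: true, degraded: true };
  }
}
```

The trade-off is deliberate: during a store outage the API stays up but is temporarily unprotected, so the `RATE_LIMITER_DEGRADED` log line is worth alerting on.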

Default policies (seed values)

These are the starting policies in code (rate-limit.defaults.ts). After first DB seed, the database copy is authoritative; edit via admin UI or PATCH.

Policy id	Paths / notes	Identity	Window	Limit	Algorithm	Default mode
`auth.login.minute`	POST auth login/refresh	IP	60s	10	fixed	enforce
`auth.login.hour`	same	IP	3600s	100	fixed	enforce
`users.read`	GET `/api/v1/users`	user	60s	120	fixed	enforce
`users.write`	mutating `/api/v1/users`	user	60s	30	fixed	enforce
`notifications.open`	POST notification opened	user	60s	60	fixed	enforce
`practice.public`	practice-pyq / practice-yt	user_or_ip	60s	120	sliding	enforce
`current-affairs.public`	current-affairs	user_or_ip	60s	120	sliding	enforce
`admin.read`	GET `/api/v1/admin`	user	60s	240	fixed	shadow
`internal.notifications`	`/internal/notifications`	internal	60s	600	fixed	shadow
`global`	`/api/v1`, `/internal` fallback	user_or_ip	60s	600	fixed	enforce

Tune limits in production based on monitoring and product needs. Use shadow first for anything risky, then enforce.

Code map (where to read in `hono-backend`)

Area	Path
Middleware mount + CORS headers	`src/index.ts`
Limiter middleware	`src/middleware/rate-limit.middleware.ts`
Matching + algorithms	`src/services/rate-limit/policy-engine.ts`, `algorithms.ts`
Stores	`src/services/rate-limit/memory-store.ts`, `mongo-store.ts`, `create-store.ts`
Config cache	`src/services/rate-limit/config-cache.ts`
Defaults	`src/config/rate-limit.defaults.ts`
Admin routes	`src/routes/admin-rate-limits.routes.ts`
Permission constant	`src/config/permissions.ts` — `MANAGE_RATE_LIMITS`

Kill switches (incidents)

Set the noisy policy mode to off and save (super admin).
Or disable globally: Disable all in Rate Limits UI (or PATCH with enabled: false).
Optionally switch RATE_LIMIT_STORE and restart if the counter backend misbehaves.

Related: Architecture overview · Layers · Route map