API rate limiting
Rate limiting guards against abuse such as brute-force logins, scraping of public catalogs, expensive admin requests, and floods of notification webhooks. Policies live in MongoDB, and super admins can edit them from the admin app without redeploying the server.
How a request flows
1. CORS: browser origin check.
2. requestIdMiddleware: assigns or honors X-Request-Id.
3. rateLimitMiddleware: loads cached config, matches policies, updates counters, may return 429.
4. Route + handler: auth, validation, business logic.
If global rate limiting is disabled (enabled: false in config), step 3 skips policy matching and counting and passes the request through.
Glossary (quick reference)
| Term | What it means |
|---|---|
| Policy | One rule: which URLs, which HTTP methods, how many requests allowed in a window, how to count them, and whether to block. |
| Window | A time slice (windowSeconds). Counts reset or roll when the window advances. |
| Limit | Max allowed “usage” in that policy’s sense (see Algorithms below). |
| Identity | Who we count: by IP, by logged-in user id, both (prefer user), or internal worker. |
| Mode | off / shadow / enforce-soft / enforce — whether we only observe, soft-block, or hard-block. |
| Store | Where counters live: memory (single process) or mongo (shared across instances). |
| Config cache | In-process copy of the Mongo config; refreshes about every 30 seconds after load or invalidation. |
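The config cache entry above can be pictured as a small time-boxed cache. This is a minimal sketch, not the code in config-cache.ts: the class name, the injected loader, and the injectable clock are all hypothetical, and only the roughly 30 second refresh interval comes from the glossary.

```typescript
// Hypothetical shape; the real config document may carry more fields.
type CachedConfig = { enabled: boolean; policies: unknown[] };

class ConfigCache {
  private value: CachedConfig | null = null;
  private loadedAt = 0;
  private readonly load: () => Promise<CachedConfig>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    load: () => Promise<CachedConfig>, // e.g. a MongoDB read (assumption)
    ttlMs: number = 30_000,
    now: () => number = Date.now,
  ) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached copy while it is fresh; otherwise reload from the source.
  async get(): Promise<CachedConfig> {
    const t = this.now();
    if (this.value === null || t - this.loadedAt >= this.ttlMs) {
      this.value = await this.load();
      this.loadedAt = t;
    }
    return this.value;
  }

  // Called after an admin write so the next request reloads the config.
  invalidate(): void {
    this.value = null;
  }
}
```

Because each process holds its own copy, a change made on one instance may take up to one TTL to reach the others unless they are invalidated too.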
Policy fields (what each thing does)
Each policy in the database (and in the admin UI) has these fields:
| Field | Purpose |
|---|---|
| id | Stable unique key (e.g. users.read). Used in logs, events, and API paths. |
| name | Human label in admin UI. |
| routeGroup | Tag for reporting (groups related limits in monitoring). |
| pathPrefixes | List of URL path prefixes. A request matches if its path equals a prefix or starts with prefix + '/'. Example: /api/v1/users matches /api/v1/users/me. |
| methods | Optional. If set, only these HTTP methods match (uppercase in storage, e.g. GET, POST). If omitted, any method matches the paths. |
| identity | ip: count per client IP. user: per Bearer access token user id (decoded without running full route auth). user_or_ip: use user id when present, else IP. internal: only applies when x-internal-key is present (worker calls). |
| windowSeconds | Length of the rate window in seconds (for fixed / sliding) or the period used to derive token refill rate for token_bucket. |
| limit | Maximum count (fixed/sliding) or bucket capacity (token bucket). |
| algorithm | fixed: count requests in the current window. sliding: smoother estimate using current + previous window. token_bucket: burst-friendly refill (good for expensive endpoints). |
| mode | off: ignore. shadow: count and log “would block” but always allow the request. enforce-soft: allow up to about 3× the limit before blocking. enforce: block when over limit. |
| weight | When several policies match, they are evaluated in weight order (higher first). The most restrictive outcome (lowest remaining quota or blocked) wins for the response. |
| allowlist | List of raw identity strings (e.g. ip:203.0.113.5, user:<mongoId>) that skip this policy. |
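The matching rules for pathPrefixes, methods, and weight can be sketched as follows. This is an illustration under stated assumptions rather than the code in policy-engine.ts: the type is trimmed to the fields that matter for matching, the function names are hypothetical, and treating an empty methods array like an omitted one is a guess.

```typescript
type MatchablePolicy = {
  id: string;
  pathPrefixes: string[];
  methods?: string[]; // uppercase; omitted means any method
  weight: number;
  mode: "off" | "shadow" | "enforce-soft" | "enforce";
};

// A path matches when it equals a prefix or continues it after a '/'.
// "/api/v1/users" therefore matches "/api/v1/users/me" but not "/api/v1/usersettings".
function pathMatches(prefixes: string[], path: string): boolean {
  return prefixes.some((p) => path === p || path.startsWith(p + "/"));
}

function policyMatches(policy: MatchablePolicy, method: string, path: string): boolean {
  if (policy.mode === "off") return false; // "off" policies are ignored
  const methods = policy.methods;
  // Assumption: an empty methods list behaves like an omitted one.
  if (methods && methods.length > 0 && !methods.includes(method.toUpperCase())) {
    return false;
  }
  return pathMatches(policy.pathPrefixes, path);
}

// All matching policies, highest weight first.
function matchingPolicies(
  policies: MatchablePolicy[],
  method: string,
  path: string,
): MatchablePolicy[] {
  return policies
    .filter((p) => policyMatches(p, method, path))
    .sort((a, b) => b.weight - a.weight);
}
```

With a users.read policy on GET /api/v1/users and a catch-all on /api/v1, a GET to /api/v1/users/me matches both, while a POST to the same path matches only the catch-all.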
Super-admin bypass
If the user is a super admin (isAdmin: true on the user document), policies with identity: user are skipped for that user. IP-based and user_or_ip policies still apply when falling back to IP.
GET /health and GET /ready are never rate limited.
Client IP detection
The limiter prefers CF-Connecting-IP (Cloudflare), otherwise the last entry in X-Forwarded-For (typical reverse-proxy shape). If neither is present, identity may show as ip:unknown.
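The header precedence described above might look like the sketch below. The function name and the plain lowercase header map are assumptions made for illustration; the real middleware reads headers from the request object.

```typescript
// Resolve the IP identity string from request headers (keys lowercased here).
function clientIpIdentity(headers: Record<string, string | undefined>): string {
  // 1. Cloudflare's header wins when present.
  const cf = headers["cf-connecting-ip"]?.trim();
  if (cf) return `ip:${cf}`;

  // 2. Otherwise take the last non-empty entry of X-Forwarded-For.
  const forwarded = headers["x-forwarded-for"];
  if (forwarded) {
    const entries = forwarded
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0);
    const last = entries[entries.length - 1];
    if (last) return `ip:${last}`;
  }

  // 3. Nothing usable: all such requests share one bucket.
  return "ip:unknown";
}
```

The last X-Forwarded-For entry is the one appended by the nearest proxy, so it is harder for a client to spoof than the first entry. It only identifies the real client when exactly one trusted proxy sits in front of the app.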
Algorithms (what each one does)
| Algorithm | Behavior | Good for |
|---|---|---|
| fixed | One counter per identity per time bucket (e.g. per minute). Simple; can allow short bursts at bucket edges. | Auth walls, coarse caps. |
| sliding | Uses the current window count plus a weighted fraction of the previous window, which is closer to “rolling” fairness. | Public read APIs. |
| token_bucket | Tokens refill at limit / windowSeconds per second, up to a capacity of limit; each request consumes one token. Allows controlled bursts. | Expensive admin POSTs. |
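The three algorithms in the table can be written as small pure functions. This is a sketch of the standard techniques, not a copy of algorithms.ts; the function names, the key format, and the bucket state shape are hypothetical.

```typescript
// Fixed window: one counter per identity per bucket of windowSeconds.
function fixedWindowKey(identity: string, nowMs: number, windowSeconds: number): string {
  const bucket = Math.floor(nowMs / (windowSeconds * 1000));
  return `${identity}:${bucket}`;
}

// Sliding window estimate: the current count plus the previous window's count,
// weighted by the share of the previous window that still overlaps the last windowSeconds.
function slidingEstimate(
  current: number,
  previous: number,
  nowMs: number,
  windowSeconds: number,
): number {
  const windowMs = windowSeconds * 1000;
  const elapsedFraction = (nowMs % windowMs) / windowMs;
  return current + previous * (1 - elapsedFraction);
}

type TokenBucket = { tokens: number; updatedMs: number };

// Token bucket: refill at limit / windowSeconds tokens per second, capped at limit.
// A request is allowed only when at least one whole token is available.
function takeToken(
  bucket: TokenBucket,
  nowMs: number,
  limit: number,
  windowSeconds: number,
): { allowed: boolean; bucket: TokenBucket } {
  const elapsedMs = Math.max(0, nowMs - bucket.updatedMs);
  const refill = (elapsedMs * limit) / (windowSeconds * 1000);
  const tokens = Math.min(limit, bucket.tokens + refill);
  if (tokens >= 1) {
    return { allowed: true, bucket: { tokens: tokens - 1, updatedMs: nowMs } };
  }
  return { allowed: false, bucket: { tokens, updatedMs: nowMs } };
}
```

For example, 15 seconds into a 60 second window with 10 requests so far and 40 in the previous window, the sliding estimate is 10 + 40 × 0.75 = 40.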
Modes (safety ramp)
| Mode | Effect |
|---|---|
| off | Policy does not run. |
| shadow | Counts and records shadow_violation events but never returns 429 for that policy. |
| enforce-soft | Blocks only after roughly triple the configured limit. |
| enforce | Blocks when the limit is exceeded. |
Global switch: enabled: false on the singleton config turns rate limiting off for the whole app (after cache refresh).
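The mode table and the global switch reduce to one small decision. The sketch below is illustrative only: the function name, the "skipped" and "allowed" outcome labels, and the exact 3× multiplier are assumptions (the document says only "roughly triple"), and used is taken to include the current request.

```typescript
type PolicyMode = "off" | "shadow" | "enforce-soft" | "enforce";
type PolicyOutcome = "skipped" | "allowed" | "shadow_violation" | "blocked";

// Assumed constant: the doc says enforce-soft blocks at "roughly triple" the limit.
const SOFT_MULTIPLIER = 3;

function decideOutcome(
  mode: PolicyMode,
  used: number,
  limit: number,
  globallyEnabled: boolean = true,
): PolicyOutcome {
  // The global switch and mode "off" both mean the policy does not run.
  if (!globallyEnabled || mode === "off") return "skipped";

  // Shadow never blocks; it only records that a block would have happened.
  if (mode === "shadow") return used > limit ? "shadow_violation" : "allowed";

  // enforce-soft tolerates usage up to the multiplied limit; enforce does not.
  const threshold = mode === "enforce-soft" ? limit * SOFT_MULTIPLIER : limit;
  return used > threshold ? "blocked" : "allowed";
}
```

Moving a policy from shadow to enforce-soft to enforce changes only the threshold and the outcome label, which is what makes the ramp safe to do step by step.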
Environment variables
Set in hono-backend .env or process env:
| Variable | Default | Meaning |
|---|---|---|
| RATE_LIMIT_STORE | memory | memory = counters in Node RAM (fast; single instance; lost on restart). mongo = counters in MongoDB (shared; survives restart; use with multiple app instances). |
| RATE_LIMIT_EVENT_SAMPLE_RATE | 0.01 | Fraction of allowed requests that write a sampled event (0–1). Blocked and shadow outcomes are recorded without sampling. |
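One way to read these two variables and apply the sampling rule is sketched below. The function names are hypothetical, and the fallback behavior for unrecognized or out-of-range values is an assumption, not documented behavior.

```typescript
type StoreKind = "memory" | "mongo";

// Pass process.env (or any string map) in; defaults follow the table above.
function readRateLimitEnv(
  env: Record<string, string | undefined>,
): { store: StoreKind; sampleRate: number } {
  // Assumption: anything other than "mongo" falls back to the in-memory store.
  const store: StoreKind = env["RATE_LIMIT_STORE"] === "mongo" ? "mongo" : "memory";

  const raw = env["RATE_LIMIT_EVENT_SAMPLE_RATE"];
  const parsed = raw === undefined || raw.trim() === "" ? 0.01 : Number(raw);
  // Assumption: non-numeric input falls back to the default; numbers are clamped into [0, 1].
  const sampleRate = Number.isFinite(parsed) ? Math.min(1, Math.max(0, parsed)) : 0.01;

  return { store, sampleRate };
}

// Blocked and shadow outcomes are always recorded; allowed requests are sampled.
function shouldRecordEvent(
  outcome: "allowed" | "blocked" | "shadow_violation",
  sampleRate: number,
  random: () => number = Math.random,
): boolean {
  if (outcome !== "allowed") return true;
  return random() < sampleRate;
}
```

At the default rate of 0.01, roughly one allowed request in a hundred writes an event, which keeps the event collection small while still showing normal traffic.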
MongoDB collections (what persists where)
| Collection / model | Role |
|---|---|
| RateLimitConfig (key: 'global') | The enabled flag and full policies[] array. Seeded on first access if missing. |
| RateLimitCounter | Per-key counters and token-bucket state when RATE_LIMIT_STORE=mongo. TTL on documents cleans old windows. |
| RateLimitEvent | Optional audit stream: blocked, shadow violations, sampled allows. Short TTL (e.g. ~14 days). |
| RateLimitRollup | Daily aggregates for dashboards (may be sparse until a job fills them). |
| RateLimitAudit | Admin change log: who changed config/policies and before/after snapshots. |
Admin UI: how to use it
Requires being logged in to admin-frontend as a staff user.
- Rate Limits (/rate-limits): table of policies; select one to edit limit, window, mode, algorithm, identity, prefixes, allowlist. Toggle Enable all / Disable all for the global switch. Recent changes shows audit entries.
- Rate Limit Monitoring (/rate-limits/monitoring): rollups and recent events (blocked vs shadow vs sampled allowed).
Permissions:
- VIEW_STATS: read config summary, events, metrics (GET endpoints).
- MANAGE_RATE_LIMITS: super admin only in the backend role map; change policies, global enable, create/delete policy rows, dry-run test.
If your account is only manager / creator, you will not see the Rate Limits nav items unless the account also has isAdmin: true.
HTTP API (for automation or curl)
Base path: /api/v1/admin/rate-limits. All require Bearer access token + admin middleware.
| Method | Path | Permission | Returns / does |
|---|---|---|---|
| GET | / | VIEW_STATS | { config, lastRollups, auditLog } |
| GET | /events?policy=&outcome=&limit= | VIEW_STATS | Paginated event list + total |
| GET | /metrics?from=&to= | VIEW_STATS | { rollups } filtered by dayKey range |
| PATCH | / | MANAGE_RATE_LIMITS | Body { enabled?, policies? }; updates singleton + invalidates cache |
| POST | /policies | MANAGE_RATE_LIMITS | Body: full policy; upserts by id |
| PATCH | /policies/:id | MANAGE_RATE_LIMITS | Partial policy fields |
| DELETE | /policies/:id | MANAGE_RATE_LIMITS | Removes policy |
| POST | /test | MANAGE_RATE_LIMITS | Body { method, path }; returns which policies would match (no counters changed) |
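For scripting against these endpoints, a request can be assembled as in the sketch below. The helper name, the host https://api.example.com, and the token placeholder are hypothetical; only the base path, the Bearer requirement, and the request bodies come from the table. Response shapes are not modeled here.

```typescript
type AdminRequest = {
  url: string;
  method: string;
  headers: Record<string, string>;
  body?: string;
};

// Build (but do not send) a request under /api/v1/admin/rate-limits.
// Use subPath "" for the root routes listed as "/" in the table.
function adminRateLimitRequest(
  baseUrl: string,
  accessToken: string,
  method: "GET" | "PATCH" | "POST" | "DELETE",
  subPath: string,
  body?: unknown,
): AdminRequest {
  const url = baseUrl.replace(/\/+$/, "") + "/api/v1/admin/rate-limits" + subPath;
  const headers: Record<string, string> = { Authorization: `Bearer ${accessToken}` };
  if (body === undefined) return { url, method, headers };
  headers["Content-Type"] = "application/json";
  return { url, method, headers, body: JSON.stringify(body) };
}

// Dry run: which policies would match GET /api/v1/users/me? No counters change.
const dryRun = adminRateLimitRequest(
  "https://api.example.com", // hypothetical host
  "<access-token>",
  "POST",
  "/test",
  { method: "GET", path: "/api/v1/users/me" },
);

// Global kill switch, expressed as the PATCH from the table.
const disableAll = adminRateLimitRequest(
  "https://api.example.com",
  "<access-token>",
  "PATCH",
  "",
  { enabled: false },
);
```

Each object can be passed to an HTTP client, for example fetch(dryRun.url, dryRun) in runtimes that provide fetch.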
What clients see when blocked
- Status: 429 Too Many Requests
- Headers: RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and Retry-After (seconds)
- Body (JSON): error, code: "RATE_LIMITED", requestId, policy (policy id), retryAfterSeconds
Browsers must have these headers exposed by CORS; index.ts adds them to exposeHeaders.
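On the client side, the blocked response can drive a simple backoff. The helper below is a hypothetical sketch: it takes the status, a lowercase header map, and the parsed JSON body, and the one second fallback is an assumption for responses that carry neither value.

```typescript
// Fields listed for the 429 body; all are treated as optional on the client.
type RateLimitedBody = {
  error?: string;
  code?: string;
  requestId?: string;
  policy?: string;
  retryAfterSeconds?: number;
};

// Milliseconds to wait before retrying, or null if the response was not a rate-limit block.
function retryDelayMs(
  status: number,
  headers: Record<string, string | undefined>,
  body: RateLimitedBody | null,
): number | null {
  if (status !== 429) return null;

  // Prefer the Retry-After header (seconds).
  const raw = headers["retry-after"];
  if (raw !== undefined && raw.trim() !== "") {
    const seconds = Number(raw);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }

  // Fall back to the JSON body, e.g. when CORS hides the header from the browser.
  const fromBody = body?.retryAfterSeconds;
  if (typeof fromBody === "number" && Number.isFinite(fromBody) && fromBody >= 0) {
    return fromBody * 1000;
  }

  return 1000; // assumed conservative default
}
```

Logging the body's requestId and policy alongside the delay makes it easier to find the matching event in the monitoring view.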
Fail-open behavior
If the limiter throws (e.g. DB timeout), the middleware logs RATE_LIMITER_DEGRADED and allows the request so a broken limiter does not take down the API.
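The fail-open rule amounts to a try/catch around the limiter check. In this sketch the wrapper name, the verdict shape, and the injected check and log callbacks are hypothetical; only the RATE_LIMITER_DEGRADED log key and the allow-on-error behavior come from the text above.

```typescript
type Verdict = { allowed: boolean; degraded: boolean };

// Run the limiter check; if it throws, log and let the request through.
async function checkFailOpen(
  check: () => Promise<boolean>, // resolves to true when the request may proceed
  log: (event: string, err: unknown) => void,
): Promise<Verdict> {
  try {
    return { allowed: await check(), degraded: false };
  } catch (err) {
    // A broken limiter must not take the API down.
    log("RATE_LIMITER_DEGRADED", err);
    return { allowed: true, degraded: true };
  }
}
```

The trade-off is that limits are not enforced while the limiter is failing, so the RATE_LIMITER_DEGRADED log line is worth alerting on.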
Default policies (seed values)
These are the starting policies in code (rate-limit.defaults.ts). After first DB seed, the database copy is authoritative; edit via admin UI or PATCH.
| Policy id | Paths / notes | Identity | Window | Limit | Algorithm | Default mode |
|---|---|---|---|---|---|---|
| auth.login.minute | POST auth login/refresh | IP | 60s | 10 | fixed | enforce |
| auth.login.hour | same | IP | 3600s | 100 | fixed | enforce |
| users.read | GET /api/v1/users | user | 60s | 120 | fixed | enforce |
| users.write | mutating /api/v1/users | user | 60s | 30 | fixed | enforce |
| notifications.open | POST notification opened | user | 60s | 60 | fixed | enforce |
| practice.public | practice-pyq / practice-yt | user_or_ip | 60s | 120 | sliding | enforce |
| current-affairs.public | current-affairs | user_or_ip | 60s | 120 | sliding | enforce |
| admin.read | GET /api/v1/admin | user | 60s | 240 | fixed | shadow |
| internal.notifications | /internal/notifications | internal | 60s | 600 | fixed | shadow |
| global | /api/v1, /internal fallback | user_or_ip | 60s | 600 | fixed | enforce |
Tune limits in production based on monitoring and product needs. Use shadow first for anything risky, then enforce.
Code map (where to read in hono-backend)
| Area | Path |
|---|---|
| Middleware mount + CORS headers | src/index.ts |
| Limiter middleware | src/middleware/rate-limit.middleware.ts |
| Matching + algorithms | src/services/rate-limit/policy-engine.ts, algorithms.ts |
| Stores | src/services/rate-limit/memory-store.ts, mongo-store.ts, create-store.ts |
| Config cache | src/services/rate-limit/config-cache.ts |
| Defaults | src/config/rate-limit.defaults.ts |
| Admin routes | src/routes/admin-rate-limits.routes.ts |
| Permission constant | src/config/permissions.ts — MANAGE_RATE_LIMITS |
Kill switches (incidents)
- Set the noisy policy mode to off and save (super admin).
- Or disable globally: Disable all in Rate Limits UI (or PATCH with enabled: false).
- Optionally switch RATE_LIMIT_STORE and restart if the counter backend misbehaves.
Related: Architecture overview · Layers · Route map