A startup I know got their /login endpoint hammered with 40,000 requests in under 3 minutes. No rate limiting, no lockout, no alerting. The attacker was credential-stuffing with a leaked password list. By the time anyone noticed, 12 accounts were compromised. The fix took 2 hours to deploy. The damage took weeks to contain.
Rate limiting is one of those things developers always plan to add "later." This article is about making sure later is now.
The Fixed Window Problem (The Gotcha Junior Devs Miss)
Fixed window rate limiting counts requests per fixed, clock-aligned interval — e.g., "100 requests per minute," with the counter resetting at the top of every minute.
The bypass: send 100 requests at 12:00:59, then another 100 at 12:01:01. You've sent 200 requests in 2 seconds and bypassed the limit entirely.
Use sliding window instead. A sliding window tracks requests within the last N seconds from now, not from the start of the current clock interval. It's marginally more expensive to compute (requires storing timestamps or using a sorted set in Redis), but it eliminates this bypass completely.
Most production rate-limiting libraries use sliding window by default — but verify yours does.
Where to Apply Rate Limiting
Not all endpoints are equal. Focus your tightest limits here:
| Endpoint | Recommended Limit | Notes |
|---|---|---|
| POST /login | 5 attempts / 15 min per IP+username | Most critical |
| POST /register | 10 / hour per IP | Prevents account farming |
| POST /password-reset | 3 / hour per IP+email | Reset token enumeration |
| POST /verify-otp | 3 attempts then lock | OTP has small keyspace |
| GET /api/* (general) | 100-500 / min per token | DDoS mitigation |
OTP endpoints deserve special attention. A 6-digit OTP has 1,000,000 possible values. At 1,000 requests per second with no limiting, an attacker can exhaust the entire keyspace in under 17 minutes, and on average will succeed in half that time. Three attempts then hard lock is the correct call here.
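The arithmetic, spelled out (the unthrottled guess rate is an assumption; scale it to your threat model):

```python
keyspace = 10 ** 6       # 6-digit OTP: codes 000000 through 999999
rate = 1_000             # assumed unthrottled guesses per second

seconds_to_exhaust = keyspace / rate
minutes_to_exhaust = seconds_to_exhaust / 60   # worst case; average success at half this
```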
Implementation
The core pattern is: track request count by key (IP + identifier), check against threshold, return 429 if exceeded.
Progressive Delays vs. Hard Lockout
Hard account lockout (block after N failures) is a denial-of-service vector — an attacker can lock out every user in your system just by sending bad credentials. Don't do this without thinking carefully.
Progressive delays are better for most cases:
- 1st failure: immediate
- 2nd failure: 1 second delay
- 3rd failure: 5 second delay
- 4th failure: 30 second delay
- 5th+ failure: 15 minute soft lock
This makes automated brute force practically infeasible while keeping the experience reasonable for real users who mistype their password.
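The schedule above reduces to a simple lookup (a minimal illustration; tune the values for your product):

```python
# Delay (in seconds) applied before responding to the Nth consecutive failure.
DELAYS = {1: 0, 2: 1, 3: 5, 4: 30}
SOFT_LOCK = 15 * 60  # 5th failure and beyond: 15-minute soft lock

def failure_delay(consecutive_failures: int) -> int:
    """Seconds to delay (or soft-lock) after this many consecutive failures."""
    if consecutive_failures <= 0:
        return 0
    return DELAYS.get(consecutive_failures, SOFT_LOCK)
```

Reset the failure counter on successful login so legitimate users who eventually get it right start fresh.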
Hard lock only makes sense for high-value targets like OTP endpoints or admin panels, and even then, consider locking the session rather than the account.
Don't Forget These Response Headers
A 429 without Retry-After is rude and forces clients to guess when to retry — leading to thundering herd problems when many clients hammer the endpoint simultaneously. Always return:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 900
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710374400
```
The X-RateLimit-* headers aren't standardized (RFC 6585 only covers the 429 status code), but they're widely expected by API clients and SDKs.
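A small sketch of populating those headers (illustrative function name; `reset_at` is the epoch second the window reopens):

```python
import time

def rate_limit_headers(limit: int, remaining: int, reset_at: int) -> dict[str, str]:
    """Build the conventional X-RateLimit-* headers, plus Retry-After when exhausted."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_at),  # epoch seconds, per common convention
    }
    if remaining <= 0:
        headers["Retry-After"] = str(max(reset_at - int(time.time()), 0))
    return headers
```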
The Distributed System Edge Case
If you're running multiple instances of your API behind a load balancer and storing rate limit state in-memory, each instance has its own counter. An attacker can send 5 requests to each of your 4 instances = 20 effective attempts before hitting any limit.
Always store rate limit state in Redis (or another shared store) in distributed deployments. In-memory rate limiting is only acceptable for single-instance or development environments.
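In Redis, the standard trick is a sorted set of timestamps per key, pruned with ZREMRANGEBYSCORE on each check. A sketch against the redis-py client interface (assumed available; key and function names are illustrative):

```python
import time
import uuid

def allow_request(r, key: str, limit: int, window: int) -> bool:
    """Sliding-window check backed by a Redis sorted set of timestamps.

    `r` is any client exposing redis-py style zremrangebyscore/zcard/zadd/expire.
    """
    now = time.time()
    r.zremrangebyscore(key, 0, now - window)         # drop entries older than the window
    if r.zcard(key) >= limit:                        # count what's left in the window
        return False
    r.zadd(key, {f"{now}:{uuid.uuid4().hex}": now})  # unique member per request
    r.expire(key, window)                            # let idle keys clean themselves up
    return True
```

In production you would wrap these four commands in a MULTI/EXEC pipeline or a Lua script so the check-and-add is atomic under concurrent requests.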
A Note on IP-Based Limiting Alone
IP-only rate limiting breaks for:
- Corporate networks (many users sharing one IP)
- Mobile users (IP changes mid-session)
- Attackers using residential proxy networks
Combine IP + user identifier (username, email, user_id) for auth endpoints. For unauthenticated endpoints where you don't have a user identifier, IP is your only option — just set the threshold generously enough to avoid false positives on shared networks.
Implement rate limiting in this order of priority:
1. POST /login — sliding window, 5 attempts per 15 minutes, keyed by IP + username
2. POST /verify-otp — 3 attempts then hard session lock
3. POST /password-reset — 3 per hour per IP + email
4. All other auth endpoints — sensible limits before moving on
5. General API endpoints — once auth is protected
Back your counters with Redis in production. Return proper 429 responses with Retry-After. Skip fixed window — use sliding window. And add alerting so you know when an attack is happening in the first place — rate limiting slows attackers down, but you still want to know they're knocking.